Search Results: "kov"

15 October 2020

Dirk Eddelbuettel: dang 0.0.12: Two new functions

A new release of the dang package is now on CRAN, roughly one year after the last release. The dang package regroups a few functions of mine that had no other home; for example, lsos() from a 2009 (!!) StackOverflow question is one, and this overbought/oversold price band plotter from an older blog post is another. More recently added were helpers for data.table to xts conversion and a git repo root finder. This release adds two functions. One was mentioned just days ago in a tweet by Nathan and is a reworked version of something Colin tweeted about a few weeks ago: a little data wrangling off the kewl rtweet package to find maximally spammy accounts per search topic, in other words those who include more than N hashtags for a given search term. The other is something I, if memory serves, picked up a while back on one of the lists: a base R function to identify non-ASCII characters in a file. It is a C function that is not directly exported by R and hence not accessible, so we put it here (with credits, of course). I mentioned it yesterday when announcing tidyCpp, as this C function was the starting point for the new tidyCpp wrapper around some of R's C API functions. The (very short) NEWS entry follows.

Changes in version 0.0.12 (2020-10-14)
  • New functions muteTweets and checkPackageAsciiCode.
  • Updated CI setup.

Courtesy of CRANberries, there is a comparison to the previous release. For questions or comments use the issue tracker off the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

4 October 2020

Sylvain Beucler: git filter-branch and --state-branch - how?

I'm mirroring and reworking a large Git repository with git filter-branch (conversion ETA: 20h), and I was wondering how to use --state-branch, which is supposed to speed up later updates, or to split a large conversion into several updates. The documentation is pretty terse, the option can produce weird results (like an identity mapping that breaks all later updates, or calling the expensive tree-filter but discarding the results), and existing wrappers are convoluted, but I got something to work, so I'll share :) The main point is: run the initial script and the later updates in the same configuration, which means the target branch needs to be reset to the upstream branch each time, before it's rewritten again by filter-branch. In other words, don't re-run it on the rewritten branch, nor attempt some complex merge/cherry-pick.
git fetch
git branch --no-track -f myrewrite origin/master
git filter-branch \
  --xxx-filter ... \
  --xxx-filter ... \
  --state-branch refs/heads/filter-branch/myrewrite \
  -d /dev/shm/filter-branch/$$ -f \
  myrewrite
Updates restart from scratch but only take a few seconds to skim through all the already-rewritten commits, and maintain a stable history. Note that if the process is interrupted, the state-branch isn't modified, so it's not a stop/resume feature. If you want to split a lengthy conversion, you could simulate multiple upstream updates by checking out successive points in history (e.g. per year using $(git rev-list -1 --before='2020-01-01 00:00:00Z')); see the sketch below. --state-branch isn't meant to rewrite in reverse chronological order either, because all commit ids would constantly change. Still, you can rewrite only the recent history for a quick discardable test. Be cautious when using/deleting rewritten branches, especially during early tests, because Git tends to save them to multiple places which may desync (e.g. .git/refs/heads/, .git/logs/refs/, .git/packed-refs). Also remember to delete the state-branch between different tests. Last, note the unique temporary directory -d to avoid ruining concurrent tests ^_^'
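For example, a sketch of the yearly-chunk approach (the --xxx-filter placeholders stand in for whatever filters you use above):
git fetch
for cutoff in 2018-01-01 2019-01-01 2020-01-01; do
  # reset the target branch to the upstream state at this cutoff date
  git branch --no-track -f myrewrite \
    "$(git rev-list -1 --before="$cutoff 00:00:00Z" origin/master)"
  # rewrite up to that point; --state-branch remembers the work done so far
  git filter-branch \
    --xxx-filter ... \
    --state-branch refs/heads/filter-branch/myrewrite \
    -d /dev/shm/filter-branch/$$ -f \
    myrewrite
done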

31 August 2020

Jacob Adams: Command Line 101

How to Work in a Text-Only Environment.

What is this thing? When you first open a command-line (note that I use the terms command-line and shell interchangeably here; they're basically the same, but command-line is the more general term, and shell is the name for the program that executes commands for you) you'll see something like this:
jaadams@bg7/tmp/thisfolder$
This line is called a command prompt and it tells you four pieces of information:
  1. jaadams: The username of the user that is currently running this shell.
  2. bg7: The name of the computer that this shell is running on, important for when you start accessing shells on remote machines.
  3. /tmp/thisfolder: The folder or directory that your shell is currently running in. Like a file explorer (such as Windows Explorer or Mac's Finder), a shell always has a working directory, from which all relative paths (see sidenote below) are resolved.
  4. $: The prompt character, which marks the end of the command prompt; you type your commands after it.
When you first opened a shell, however, you might notice that it looks more like this:
jaadams@bg7~$
This is a shorthand notation that the shell uses to make this output shorter when possible. ~ stands for your home directory, usually /home/<username>. Like C:\Users\<username>\ on Windows or /Users/<username> on Mac, this directory is where all your files should go by default. Thus a command prompt like this:
jaadams@bg7~/Downloads$
actually tells you that you are currently in the /home/jaadams/Downloads directory.

Sidenote: The Unix Filesystem and Relative Paths Folders on Linux and other Unix-derived systems like MacOS are usually called directories. These directories are represented by paths, strings that indicate where the directory is on the filesystem. The one unusual part is the so-called "root directory". All files are stored in this directory or directories under it. Its path is just / and there are no directories above it. For example, the directory called home typically contains all user directories. It is stored in the root directory, and each user's specific data is stored in a directory named after that user under home. Thus, the home directory of the user jacob is typically /home/jacob: the directory jacob under the home directory, which is stored in the root directory /. If you're interested in more details about what goes in which directory, man hier has the basics, and the Filesystem Hierarchy Standard governs the layout of the filesystem on most Linux distributions. You don't always have to use the full path, however. If a path does not begin with a /, it is assumed that the path actually begins with the path of the current directory. So if you use a path like my/folders/here, and you're in the /home/jacob directory, the path will be treated like /home/jacob/my/folders/here. Each folder also contains the symbolic links .. and . (note the single dot). Symbolic links are a very powerful kind of file that is actually a reference to another file. .. always represents the parent directory of the current directory, so /home/jacob/.. links to /home/. . always links to the current directory, so /home/jacob/. links to /home/jacob.
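For example, a short session showing relative and absolute paths in action (a sketch; it assumes those directories exist):
jacob@lovelace/home/jacob$ cd my/folders/here
jacob@lovelace/home/jacob/my/folders/here$ cd ..
jacob@lovelace/home/jacob/my/folders$ cd /tmp
jacob@lovelace/tmp$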

Running commands To run a command from the command prompt, you type its name and then usually some arguments to tell it what to do. For example, the echo command displays the text passed as arguments.
jacob@lovelace/home/jacob$ echo hello world
hello world
Arguments to commands are space-separated, so in the previous example hello is the first argument and world is the second. If you need an argument to contain spaces, you'll want to put quotes around it, echo "like so". Certain arguments are called "flags" or "options" (options if they take another argument, flags otherwise) and are usually prefixed with a hyphen; they change the way a program operates. For example, the ls command outputs the contents of a directory passed as an argument, but if you add -l before the directory, it will give you more details on the files in that directory.
jacob@lovelace/tmp/test$ ls /tmp/test
1  2  3  4  5  6
jacob@lovelace/tmp/test$ ls -l /tmp/test
total 0
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 1
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 2
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 3
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 4
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 5
-rw-r--r-- 1 jacob jacob 0 Aug 26 22:06 6
jacob@lovelace/tmp/test$
Most commands take different flags to change their behavior in various ways.

File Management
  • cd <path>: Change the current directory of the running shell to <path>.
  • ls <path>: Output the contents of <path>. If no path is passed, it prints the contents of the current directory.
  • touch <filename>: Create a new empty file called <filename>. Used on an existing file, it updates the file's last accessed and modified times. Most text editors can also create a new file for you, which is probably more useful.
  • mkdir <directory>: Create a new folder/directory at path <directory>.
  • mv <src> <dest>: Move a file or directory at path <src> to <dest>.
  • cp <src> <dest>: Copy a file or directory at path <src> to <dest>.
  • rm <file>: Remove a file at path <file>.
  • zip -r <zipfile> <contents...>: Create a zip file <zipfile> with contents <contents>. <contents> can be multiple arguments, and you'll usually want to use the -r argument when including directories in your zipfile, as otherwise only the directory will be included and not the files and directories within it.
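A short session tying these commands together (a sketch; the names are made up):
jacob@lovelace/tmp$ mkdir project
jacob@lovelace/tmp$ cd project
jacob@lovelace/tmp/project$ touch notes.txt
jacob@lovelace/tmp/project$ cp notes.txt backup.txt
jacob@lovelace/tmp/project$ mv backup.txt old-notes.txt
jacob@lovelace/tmp/project$ rm old-notes.txt
jacob@lovelace/tmp/project$ cd ..
jacob@lovelace/tmp$ zip -r project.zip project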

Searching
  • grep <thing> <file>: Look for the string <thing> in <file>. If no <file> is passed it searches standard input.
  • find <path> -name <name>: Find a file or directory called <name> somewhere under <path>. This command is actually very powerful, but also very complex. For example, you can delete all files in a directory older than 30 days with:
    find <path> -mtime +30 -exec rm {} \;
    
  • locate <name>: A much easier-to-use command to find a file with a given name, but it is not usually installed by default.

Outputting Files
  • cat <files...>: Output (concatenate) all the files passed as arguments.
  • head <file>: Output the beginning of <file>.
  • tail <file>: Output the end of <file>.

How to Find the Right Command All commands (at least on sane Linux distributions like Debian or Ubuntu) are documented with a manual page, in man section 1 (for more information on manual sections, run man intro). This can be accessed using man <command>. You can search for the right command using the -k flag, as in man -k <search>. You can also view manual pages in your browser, on sites like https://manpages.debian.org or https://linux.die.net/man. This is not always helpful, however, because some commands' descriptions are not particularly useful, and also there are a lot of manual pages, which can make searching for a specific one difficult. For example, finding the right command to search inside text files is quite difficult via man (it's grep). When you can't find what you need with man, I recommend falling back to searching the Internet. There are lots of bad Linux tutorials out there, but here are some reputable sources I recommend:
  • https://www.cyberciti.biz: nixCraft has excellent tutorials on all things Linux
  • Hosting providers like Digital Ocean or Linode: Good intro documentation, but can sometimes be outdated
  • https://tldp.org: The Linux Documentation project is great, but it can also be a little outdated sometimes.
  • https://stackoverflow.com: Oftentimes has great answers, but quality varies wildly since anyone can answer.
These are certainly not the only options but they re the sources I would recommend when available.

How to Read a Manual Page Manual pages consist of a series of sections, each with a specific purpose. Instead of attempting to write my own description here, I'm going to borrow the excellent one from The Linux Documentation Project:
The NAME section is the only required section. Man pages without a name section are as useful as refrigerators at the north pole. This section also has a standardized format consisting of a comma-separated list of program or function names, followed by a dash, followed by a short (usually one line) description of the functionality the program (or function, or file) is supposed to provide. By means of makewhatis(8), the name sections make it into the whatis database files. Makewhatis is the reason the name section must exist, and why it must adhere to the format I described. (Formatting explanation cut for brevity) The SYNOPSIS section is intended to give a short overview on available program options. For functions this section lists corresponding include files and the prototype so the programmer knows the type and number of arguments as well as the return type. The DESCRIPTION section eloquently explains why your sequence of 0s and 1s is worth anything at all. Here's where you write down all your knowledge. This is the Hall Of Fame. Win other programmers' and users' admiration by making this section the source of reliable and detailed information. Explain what the arguments are for, the file format, what algorithms do the dirty jobs. The OPTIONS section gives a description of how each option affects program behaviour. You knew that, didn't you? The FILES section lists files the program or function uses. For example, it lists configuration files, startup files, and files the program directly operates on. (Cut details about installing files) The ENVIRONMENT section lists all environment variables that affect your program or function and tells how, of course. Most commonly the variables will hold pathnames, filenames or default options. The DIAGNOSTICS section should give an overview of the most common error messages from your program and how to cope with them. There's no need to explain system error messages (from perror(3)) or fatal signals (from psignal(3)) as they can appear during execution of any program. The BUGS section should ideally be non-existent. If you're brave, you can describe here the limitations, known inconveniences and features that others may regard as misfeatures. If you're not so brave, rename it the TO DO section ;-) The AUTHOR section is nice to have in case there are gross errors in the documentation or program behaviour (Bzzt!) and you want to mail a bug report. The SEE ALSO section is a list of related man pages in alphabetical order. Conventionally, it is the last section.

Remote Access One of the more powerful uses of the shell is through ssh, the secure shell. This allows you to remotely connect to another computer and run a shell on that machine:
user@host:~$ ssh other@example.com
other@example:~$
The prompt changes to reflect the change in user and host, as you can see in the example above. This allows you to work in a shell on that machine as if it was right in front of you.

Moving Files Between Machines There are several ways you can move files between machines over ssh. The first and easiest is scp, which works much like the cp command except that paths can also take a user@host argument to move files across computers. For example, if you wanted to move a file test.txt to your home directory on another machine, the command would look like:
scp test.txt other@example.com:
(The home directory is the default path.) You can also copy files in the other direction by reversing the order of the arguments, and you can put a path after the colon to fetch files from another directory on the remote host. For example, if you wanted to fetch the file /etc/issue.net from example.com:
scp other@example.com:/etc/issue.net .
Another option is the sftp command, which gives you a very simple shell-like interface in which you can cd and ls, before either putting files onto the remote machine or getting files off of it. The final and most powerful option is rsync, which syncs the contents of one directory to another and doesn't copy files that haven't changed. It's powerful and complex, however, so I recommend reading the USAGE section of its man page.
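For instance, a typical rsync invocation might look like this (a sketch; the host and paths are made up, and the trailing slash on the source means "the contents of this directory"):
rsync -av mydir/ other@example.com:backup/mydir/   # -a preserves permissions and times, -v is verbose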

Long-Running Commands The one problem with ssh is that it will stop any command running in your shell when you disconnect. If you want to leave something on and come back later then this can be a problem. This is where terminal multiplexers come in. tmux and screen both allow you to run a shell in a safe environment where it will continue even if you disconnect from it. You do this by running the command without any arguments, i.e. just tmux or just screen. In tmux you can disconnect from the current session by pressing Ctrl+b then d, and reattach with the tmux attach command. screen works similarly, but with Ctrl+a instead of b and screen -r to reattach.
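A typical tmux round-trip might look like this (long-running-command stands in for whatever you want to keep running):
other@example:~$ tmux                  # start a new session on the remote machine
other@example:~$ long-running-command
# press Ctrl+b then d to detach, then disconnect; later, reconnect and run:
other@example:~$ tmux attach           # your session and command are still there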

Command Inputs and Outputs Arguments are not the only way to pass input to a command. Commands can also take input from what's called "standard input", which the shell usually connects to your keyboard. Output can go to two places: standard output and standard error, both of which are directed to the screen by default.

Redirecting I/O Remember how I said above that standard input/output/error are only usually connected to the keyboard and the terminal? This is because you can redirect them to other places with the shell operators <, > and the very powerful |.

File redirects The operators < and > redirect the input and output of a command to a file. For example, if you wanted a file called list.txt that contained a list of all the files in the directory /this/one/here, you could use:
ls /this/one/here > list.txt
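The < operator works the same way in the other direction; for example, to count the lines of the file we just created (wc -l counts lines, as explained below):
wc -l < list.txt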

Pipelines The pipe character, |, allows you to direct the output of one command into the input of another. This can be very powerful. For example, the following pipeline lists the contents of the current directory, searches for the string "test", then counts the number of results (wc -l counts the number of lines in its input):
ls | grep test | wc -l
For a better, but even more contrived example, say you have a file myfile with a bunch of lines of potentially duplicated and unsorted data:
test
test
1234
4567
1234
You can sort it and output only the unique lines with sort and uniq (sort must run first, because uniq only removes adjacent duplicates):
$ sort < myfile | uniq
1234
4567
test

Save Yourself Some Typing: Globs and Tab-Completion Sometimes you don't want to type out the whole filename when writing out a command. The shell can help you here by autocompleting when you press the tab key. If you have a whole bunch of files with the same suffix, you can refer to them when writing arguments as *.suffix. This also works with prefixes, prefix*, and in fact you can put a * anywhere, *middle*. The shell will expand that * into all the files in that directory that match your criteria (ending with a specific suffix, starting with a specific prefix, and so on) and pass each file as a separate argument to the command. For example, if I have a series of files called 1.txt, 2.txt, and so on up to 9.txt, each containing just the number for which it's named, I could use cat to output all of them like so:
jacob@lovelace/tmp/numbers$ ls
1.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt	9.txt
jacob@lovelace/tmp/numbers$ cat *.txt
1
2
3
4
5
6
7
8
9
Also, the ~ shorthand mentioned above, which refers to your home directory, can be used when passing a path as an argument to a command.

Ifs and For loops The files in the above example were generated with the following shell commands:
for i in 1 2 3 4 5 6 7 8 9
do
echo $i > $i.txt
done
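Since the heading promises ifs as well, here is a minimal taste of a shell conditional (checking for one of the files generated above):
if [ -f 1.txt ]
then
echo "1.txt exists"
fi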
But I'll have to save variables, conditionals and loops for another day, because this is already too long. Needless to say, the shell is a full programming language, although a very ugly and dangerous one.

24 August 2020

Arnaud Rebillout: Send emails from your terminal with msmtp

In this tutorial, we'll configure everything needed to send emails from the terminal. We'll use msmtp, a lightweight SMTP client. For the sake of the example, we'll use a GMail account, but any other email provider can do. Your OS is expected to be Debian, as usual on this blog, although it doesn't really matter. We will also see how to store the credentials for the email account in the system keyring. And finally, we'll go the extra mile, and see how to configure various command-line utilities so that they automatically use msmtp to send emails. Even better, we'll make msmtp the default email sender, to actually avoid configuring these utilities one by one.

Prerequisites You should be comfortable on the command line and have an email account you want to send from. Ideally your setup matches this tutorial (Debian and GMail), but if it doesn't match exactly, that's fine, you can still read on.

GMail account setup For a GMail account, there's a bit of configuration to do. For other email providers, I have no idea, maybe you can just skip this part, or maybe you will have to go through a similar procedure. If you want an external program (msmtp in this case) to talk to the GMail servers on your behalf, and send emails, you can't just use your usual GMail password. Instead, GMail requires you to generate so-called app passwords, one for each application that needs to access your GMail account. This approach has several advantages; most notably, you can revoke one application's access without affecting the others or changing your main password. So app passwords are a good idea, it just requires a bit of work to set it up. Let's see what it takes. First, 2-Step Verification must be enabled on your GMail account. Visit https://myaccount.google.com/security, and if that's not the case, enable it. You'll need to authorize all of your devices (computer(s), phone(s) and so on), and it can be a bit tedious, granted. But you only have to do it once in a lifetime, and after it's done, you're left with a more secure account, so it's not that bad, right? Enabling the 2-Step Verification will unlock the feature we need: App passwords. Visit https://myaccount.google.com/apppasswords, and under "Signing in to Google", click "App passwords", and generate one. An app password is a 16-character string, something like qwertyuiopqwerty. It's supposed to be used from only one place, i.e. from ONE application that is installed on ONE device. That's why it's common to give it a name of the form application@device, so in our case it could be msmtp@laptop, but really it's free form, choose whatever name suits you, as long as it makes sense to you. So let's give a name to this app password, write it down for now, and we're done with the GMail config.

Send your first email Time to get started with msmtp. First things first: installation is trivial:
sudo apt install msmtp
Let's try to send an email. At this point we haven't created any configuration file for msmtp yet, so we have to provide every detail on the command line.
# Write a dummy email
cat << EOF > message.txt
From: YOUR_LOGIN@gmail.com
To: SOMEONE_ELSE@SOMEWHERE_ELSE.com
Subject: Cafe Sua Da

Iced-coffee with condensed milk
EOF
# Send it
cat message.txt | msmtp \
    --auth=on --tls=on \
    --host smtp.gmail.com \
    --port 587 \
    --user YOUR_LOGIN \
    --read-envelope-from \
    --read-recipients
# msmtp prompts you for your password:
# this is where goes the app password!
Obviously, in this example you should replace the uppercase words with the real thing, that is, your email login and real email addresses. Also, let me insist, you must enter the app password that was generated previously, not your real GMail password. And it should work already: this email should have been sent and received by now. So let me explain quickly what happened here. In the file message.txt, we provided From: (the email address of the person sending the email) and To: (the destination email address). Then we asked msmtp to re-use those values to set the envelope of the email with --read-envelope-from and --read-recipients. What about the other parameters? For more details, you should refer to the msmtp documentation.

Write a configuration file So we could send an email, that's cool already. However the command to do that was a bit long, and we don't want to juggle with all these arguments every time we send an email. So let's write down all of that into a configuration file. msmtp supports two locations: ~/.msmtprc and ~/.config/msmtp/config, at your preference. In this tutorial we'll use ~/.msmtprc for brevity:
cat << 'EOF' > ~/.msmtprc
defaults
tls on
account gmail
auth on
host smtp.gmail.com
port 587
user YOUR_LOGIN
from YOUR_LOGIN@gmail.com
account default : gmail
EOF
And for a quick explanation: the defaults section gathers settings that apply to all accounts (here, TLS is always on); then we define an account named gmail with the server, port, login and From address; and the last line makes gmail the default account. All in all it's pretty simple, and it's becoming easier to send an email:
# Write a dummy email. Note that the
# header 'From:' is no longer needed,
# it's already in '~/.msmtprc'.
cat << 'EOF' > message.txt
To: SOMEONE_ELSE@SOMEWHERE_ELSE.com
Subject: Flat White
The milky way for coffee
EOF
# Send it
cat message.txt | msmtp \
    --account default \
    --read-recipients
Actually, --account default is not needed, as it's the default anyway if you don't provide a --account argument. Furthermore, --read-recipients can be shortened to -t. So we can make it real short now:
msmtp -t < message.txt
At this point, life is good! Except for one thing maybe: we still have to type the password every time we send an email. Surely it must be possible to avoid that annoyance...

Store your password in the system keyring For this part, we'll make use of the libsecret tool to store the password in the system keyring via the Secret Service API. It means that your desktop environment should implement the Secret Service specification, which is the case for both GNOME and KDE. Note that GNOME provides Seahorse to have a look at your secrets, while KDE has the KDE Wallet. There's also KeePassXC, which I have only heard of but never used; I guess it can be your password manager of choice if you use neither GNOME nor KDE. For those running an up-to-date Debian unstable, you should have msmtp >= 1.8.11-2, and you're all good to go. For those with an older version, however, you will have to install the package msmtp-gnome in order to have msmtp built with libsecret support. Note that this package depends on seahorse, hence it pulls in a good part of the GNOME stack when you install it. For those not running GNOME, that's unfortunate. All of this was discussed and fixed in #962689. Alright! So let's just make sure that the libsecret tools are installed:
sudo apt install libsecret-tools
And now we can store our password in the system keyring with this command:
secret-tool store --label msmtp \
    host smtp.gmail.com \
    service smtp \
    user YOUR_LOGIN
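You can verify that the secret was stored by reading it back (careful: this prints the password in clear text):
secret-tool lookup \
    host smtp.gmail.com \
    service smtp \
    user YOUR_LOGIN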
If this looks a bit too magic, and you want something more visual, you can actually fire up a GUI like seahorse (for GNOME users) or kwalletmanager5 (for KDE users), and then you will see what passwords are stored in there. Here's a screenshot of Seahorse with an msmtp password stored. Let's try to send an email again:
msmtp -t < message.txt
No need for a password anymore, msmtp got it from the system keyring! For more details on how msmtp handle the passwords, and to see what other methods are supported, refer to the extensive documentation. Use-cases and integration Let's go over a few use-cases, situations where you might end up sending emails from the command-line, and what configuration is required to make it work with msmtp. Git Send-Email Sending emails with git is a common workflow for some projects, like the Linux kernel. How does git send-email actually send emails? From the git-send-email manual page:
the built-in default is to search for sendmail in /usr/sbin, /usr/lib and $PATH if such program is available
It is possible to override this default though:
--smtp-server=
[...] Alternatively it can specify a full pathname of a sendmail-like program instead; the program must support the -i option.
So in order to use msmtp here, you'd add a snippet like that to your ~/.gitconfig file:
[sendemail]
    smtpserver = /usr/bin/msmtp
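Equivalently, you can set this from the command line:
git config --global sendemail.smtpserver /usr/bin/msmtp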
For a full guide, you can also refer to https://git-send-email.io. Debian developer tools Tools like bts or reportbug are also good examples of command-line tools that need to send emails. From the bts manual page:
--sendmail=SENDMAILCMD
Specify the sendmail command [...] Default is /usr/sbin/sendmail.
So if you want bts to send emails with msmtp instead of sendmail, you must use bts --sendmail='/usr/bin/msmtp -t'. Note that bts also loads settings from the file /etc/devscripts.conf and ~/.devscripts, so you could also set BTS_SENDMAIL_COMMAND='/usr/bin/msmtp -t' in one of those files. From the reportbug manual page:
--mta=MTA
Specify an alternate MTA, instead of /usr/sbin/sendmail (the default).
In order to use msmtp here, you'd write reportbug --mta=/usr/bin/msmtp. Note that reportbug reads its settings from /etc/reportbug.conf and ~/.reportbugrc, so you could as well set mta /usr/bin/msmtp in one of those files. So who is this sendmail again? By now, you probably noticed that sendmail seems to be considered the default tool for the job, the "traditional" command that has been around for ages. Rather than configuring every tool to use something else than sendmail, wouldn't it be simpler to actually replace sendmail by msmtp? Like, create a symlink that points to msmtp, something like ln -sr /usr/bin/msmtp /usr/sbin/sendmail? So that msmtp acts as a drop-in replacement for sendmail, and there's nothing else to configure? Answer is yes, kind of. Actually, the first msmtp feature that is listed on the homepage is "Sendmail compatible interface (command line options and exit codes)". Meaning that msmtp is a drop-in replacement for sendmail, that seems to be the intent. However, you should refrain from creating or modifying anything in /usr, as it's the territory of the package manager, apt. Any change in /usr might be overwritten by apt the next time you run an upgrade or install new packages. In the case of msmtp, there is actually a package named msmtp-mta that will create this symlink for you. So if you really want a definitive replacement for sendmail, there you go:
sudo apt install msmtp-mta
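A quick sanity check of what the package did:
readlink -f /usr/sbin/sendmail    # should print /usr/bin/msmtp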
From this point, sendmail is a symlink from /usr/sbin/sendmail to /usr/bin/msmtp, and there's no need to configure git, bts, reportbug or any other tool that would rely on sendmail. Everything should work "out of the box".

Conclusion I hope that you enjoyed reading this article! If you have any comment, feel free to send me a short email, preferably from your terminal!

17 August 2020

Ian Jackson: Doctrinal obstructiveness in Free Software

Any software system has underlying design principles, and any software project has process rules. But I seem to be seeing, more and more often, a pathological pattern where abstract and shakily-grounded broad principles, and even contrived and sophistic objections, are used to block sensible changes. Today I will go through an example in detail, before ending with a plea.

PostgreSQL query planner, WITH [MATERIALIZED] optimisation fence

Background history PostgreSQL has a sophisticated query planner which usually gets the right answer. For good reasons, the pgsql project has resisted providing lots of knobs to control query planning. But there are a few ways to influence the query planner, for when the programmer knows more than the planner. One of these is the use of a WITH common table expression. In pgsql versions prior to 12, the planner would first make a plan for the WITH clause; and then, it would make a plan for the second half, counting the WITH clause's likely output as a given. So WITH acts as an "optimisation fence". This was documented in the manual - not entirely clearly, but a careful reading of the docs reveals this behaviour:
The WITH query will generally be evaluated as written, without suppression of rows that the parent query might discard afterwards.
Users (authors of applications which use PostgreSQL) have been using this technique for a long time.

New behaviour in PostgreSQL 12 In PostgreSQL 12 upstream were able to make the query planner more sophisticated. In particular, it is now often capable of looking "into" the WITH common table expression. Much of the time this will make things better and faster. But if WITH was being used for its side-effect as an optimisation fence, this change will break things: queries that ran very quickly in earlier versions might now run very slowly. Helpfully, pgsql 12 still has a way to specify an optimisation fence: specifying WITH ... AS MATERIALIZED in the query. So far so good.

Upgrade path for existing users of WITH fence But what about the upgrade path for existing users of the WITH fence behaviour? Such users will have to update their queries to add AS MATERIALIZED. This is a small change. Having to update a query like this is part of routine software maintenance and not in itself very objectionable. However, this change cannot be made in advance, because pgsql versions prior to 12 will reject the new syntax. So the users are in a bit of a bind. The old query syntax can be unusably slow with the new database, and the new syntax is rejected by the old database. Upgrading both the database and the application, in lockstep, is a flag day upgrade, which every good sysadmin will want to avoid.

A solution to this problem Colin Watson proposed a very simple solution: make the earlier PostgreSQL versions accept the new MATERIALIZED syntax. This is correct, since the new syntax specifies precisely the actual behaviour of the old databases. It has no deleterious effect on any users of older pgsql versions. It makes it possible to add the new syntax to the application before doing the database upgrade, decoupling the two upgrades. Colin Watson even provided an implementation of this proposal.

The solution is rejected by upstream Unfortunately upstream did not accept this idea. You can read the whole thread yourself if you like. But in summary, the objections were a series of appeals to general principle and procedure. I find these extremely unconvincing, even taken together. Many of them are very unattractive things to hear one's upstream saying. At best they are knee-jerk and inflexible application of very general principles. The authors of these objections seem to have lost sight of the fact that these principles have a purpose. When these kinds of software principles work against their purposes, they should be revised, or exceptions made. At worst, it looks like a collective effort to find reasons - any reasons, no matter how bad - not to make this change.

The OFFSET 0 trick One of the responses in the thread mentions OFFSET 0. As part of writing the queries in the Xen Project CI system, and preparing for our system upgrade, I had carefully read the relevant pgsql documentation. This OFFSET 0 trick was new to me. But, now that I know the answer, it is easy to provide the right search terms and find, for example, this answer on stackmumble. Apparently adding a no-op OFFSET 0 to the subquery defeats the pgsql 12 query planner's ability to see into the subquery.
I think OFFSET 0 is the better approach since it's more obviously a hack showing that something weird is going on, and it's unlikely we'll ever change the optimiser behaviour around OFFSET 0 ... whereas hopefully CTEs will become inlineable at some point. (CTEs did become inlineable by default in PostgreSQL 12.)
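For concreteness, here is a sketch of the three spellings of the fence discussed here (the table and query are invented for illustration):
psql mydb <<'EOF'
-- pgsql < 12: plain WITH is an implicit optimisation fence
WITH w AS (SELECT * FROM t WHERE flag) SELECT * FROM w WHERE id = 42;
-- pgsql >= 12: the fence must be requested explicitly
WITH w AS MATERIALIZED (SELECT * FROM t WHERE flag) SELECT * FROM w WHERE id = 42;
-- all versions: the undocumented OFFSET 0 "hack"
SELECT * FROM (SELECT * FROM t WHERE flag OFFSET 0) AS w WHERE id = 42;
EOF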
So in fact there is a syntax for an optimisation fence that is accepted by both earlier and later PostgreSQL versions. It's even recommended by pgsql devs. It's just not documented, and is described by pgsql developers as a "hack". Astonishingly, the fact that it is a "hack" is given as a reason to use it! Well, I have therefore deployed this "hack". No doubt it will stay in our codebase indefinitely.

Please don't be like that! I could come up with a lot more examples of other projects that have exhibited similar arrogance. It is becoming a plague! But every example is contentious, and I don't really feel I need to annoy a dozen separate Free Software communities. So I won't make a laundry list of obstructiveness. If you are an upstream software developer, or a distributor of software to users (eg, a distro maintainer), you have a lot of practical power. In theory it is Free Software so your users could just change it themselves. But for a user or downstream, carrying a patch is often an unsustainable amount of work and risk. Most of us have patches we would love to be running, but which we haven't even written because simply running a nonstandard build is too difficult, no matter how technically excellent our delta. As an upstream, it is very easy to get into a mindset of defending your code's existing behaviour, and to turn your project's guidelines into inflexible rules. Constant exposure to users who make silly mistakes, and rudely ask for absurd changes, can lead to core project members feeling embattled. But there is no need for an upstream to feel embattled! You have the vast majority of the power over the software, and over your project communication fora. Use that power consciously, for good. I can't say that arrogance will hurt you in the short term. Users of software with obstructive upstreams do not have many good immediate options. But we do have longer-term choices: we can choose which software to use, and we can choose whether to try to help improve the software we use. After reading Colin's experience, I am less likely to try to help improve the experience of other PostgreSQL users by contributing upstream. It doesn't seem like there would be any point. Indeed, instead of helping the PostgreSQL community I am now using them as an example of bad practice. I'm only half sorry about that.


15 August 2020

Antonio Terceiro: Useful ffmpeg commands for editing video

For DebConf20, we are recommending that speakers pre-record the presentation part of their talks, and will have live Q&A. We had a smaller online MiniDebConf a couple of months ago, where for instance I had connectivity issues during my talk, so even though it feels too artificial, I guess pre-recording can greatly decrease the likelihood of a given talk going bad. Paul Gevers and I submitted a short 20 min talk giving an update on autopkgtest, ci.debian.net and friends. We will provide the latest updates on autopkgtest, autodep8, debci, ci.debian.net, and their integration with the Debian testing migration software, britney. We agreed on a split of the content, each one recorded their part, and I offered to join them together. The logical chaining of the topics is such that we can't just concatenate the recordings, so we need to interlace our parts. So I set out to do a full video editing job. I have done this before, although in a simpler way, for one of the MiniDebConfs we held in Curitiba. In that case, it was just cutting the noise at the beginning and the end of the recording, and adding beginning and finish screens with sponsors logos etc. The first issue I noticed was that both our recordings had a decent amount of audio noise. To extract the audio track from the videos, I resorted to How can I extract audio from video with ffmpeg? on Stack Overflow:
ffmpeg -i input-video.avi -vn -acodec copy output-audio.aac
I then edited the audio with Audacity. I passed a noise reduction filter a couple of times, then a compressor filter to amplify my recording, as Paul's already had a good volume. And those are my more advanced audio editing skills, which I acquired doing my own podcast. I now realize I could have just muted the audio tracks from the original clip and aligned the noise-free audio with it, but I ended up creating new video files with the clean audio. Another member of the Stack Overflow family came to the rescue, in How to merge audio and video file in ffmpeg. To replace the audio stream, we can do something like this:
ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 output.mp4
Paul's recording had a 4:3 aspect ratio, while the requested format is 16:9. This late in the game, there was zero chance I would request him to redo the recording. So I decided to add those black bars on the side to make it the right aspect when showing full screen. And yet again the quickest answer I could find came from the Stack Overflow empire: ffmpeg: pillarbox 4:3 to 16:9:
ffmpeg -i "input43.mkv" -vf "scale=640x480,setsar=1,pad=854:480:107:0" [etc..]
The final editing was done with pitivi, which is what I have used before. I'm a very basic user, but I could do what I needed. It was basically splitting the clips at the right places, inserting the slides as images and aligning them with the video, and making our video appear small in the corner while presenting the slides. P.S.: all the command lines presented here are examples, basically copied from the linked Q&As, and have to be adapted to your actual input and output formats.

12 August 2020

Michael Stapelberg: distri: 20x faster initramfs (initrd) from scratch

In case you are not yet familiar with why an initramfs (or initrd, or initial ramdisk) is typically used when starting Linux, let me quote the wikipedia definition: "[...] initrd is a scheme for loading a temporary root file system into memory, which may be used as part of the Linux startup process [...] to make preparations before the real root file system can be mounted." Many Linux distributions do not compile all file system drivers into the kernel, but instead load them on-demand from an initramfs, which saves memory. Another common scenario, in which an initramfs is required, is full-disk encryption: the disk must be unlocked from userspace, but since userspace is encrypted, an initramfs is used.

Motivation Thus far, building a distri disk image was quite slow. This is on an AMD Ryzen 3900X 12-core processor (2019):
distri % time make cryptimage serial=1
80.29s user 13.56s system 186% cpu 50.419 total # 19s image, 31s initrd
Of these 50 seconds, dracut's initramfs generation accounts for 31 seconds (62%)! Initramfs generation time drops to 8.7 seconds once dracut no longer needs to use the single-threaded gzip(1), but the multi-threaded replacement pigz(1). This brings the total time to build a distri disk image down to:
distri % time make cryptimage serial=1
76.85s user 13.23s system 327% cpu 27.509 total # 19s image, 8.7s initrd
Clearly, when you use dracut on any modern computer, you should make pigz available. dracut should fail to compile unless one explicitly opts into the known-slower gzip. For more thoughts on optional dependencies, see "Optional dependencies don't work". But why does it still take 8.7 seconds? Can we go faster? The answer is Yes! I recently built a distri-specific initramfs I'm calling minitrd. I wrote both big parts from scratch:
  1. the initramfs generator program (distri initrd)
  2. a custom Go userland (cmd/minitrd), running as /init in the initramfs.
minitrd generates the initramfs image in 400ms, bringing the total time down to:
distri % time make cryptimage serial=1
50.09s user 8.80s system 314% cpu 18.739 total # 18s image, 400ms initrd
(The remaining time is spent in preparing the file system, then installing and configuring the distri system, i.e. preparing a disk image you can run on real hardware.) How can minitrd be 20 times faster than dracut? dracut is mainly written in shell, with a C helper program. It drives the generation process by spawning lots of external dependencies (e.g. ldd or the dracut-install helper program). I assume that the combination of using an interpreted language (shell) that spawns lots of processes and precludes a concurrent architecture is to blame for the poor performance. minitrd is written in Go, with speed as a goal. It leverages concurrency and uses no external dependencies; everything happens within a single process (but with enough threads to saturate modern hardware). Measuring early boot time using qemu, I measured the dracut-generated initramfs taking 588ms to display the full disk encryption passphrase prompt, whereas minitrd took only 195ms. The rest of this article dives deeper into how minitrd works.

What does an initramfs do? Ultimately, the job of an initramfs is to make the root file system available and continue booting the system from there. Depending on the system setup, this involves the following 5 steps:

1. Load kernel modules to access the block devices with the root file system Depending on the system, the block devices with the root file system might already be present when the initramfs runs, or some kernel modules might need to be loaded first. On my Dell XPS 9360 laptop, the NVMe system disk is already present when the initramfs starts, whereas in qemu, we need to load the virtio_pci module, followed by the virtio_scsi module. How will our userland program know which kernel modules to load? Linux kernel modules declare patterns for their supported hardware as an alias, e.g.:
initrd# grep virtio_pci lib/modules/5.4.6/modules.alias
alias pci:v00001AF4d*sv*sd*bc*sc*i* virtio_pci
Devices in sysfs have a modalias file whose content can be matched against these declarations to identify the module to load:
initrd# cat /sys/devices/pci0000:00/*/modalias
pci:v00001AF4d00001005sv00001AF4sd00000004bc00scFFi00
pci:v00001AF4d00001004sv00001AF4sd00000008bc01sc00i00
[...]
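On a running system you can experiment with this matching yourself, using modprobe's alias resolver (a sketch of the idea; minitrd implements the matching internally):
for f in /sys/devices/pci0000:00/*/modalias; do
    modprobe --resolve-alias "$(cat $f)"   # prints the matching module names, e.g. virtio_pci
done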
Hence, for the initial round of module loading, it is sufficient to locate all modalias files within sysfs and load the responsible modules. Loading a kernel module can result in new devices appearing. When that happens, the kernel sends a uevent, which the uevent consumer in userspace receives via a netlink socket. Typically, this consumer is udev(7), but in our case, it's minitrd. For each uevent message that comes with a MODALIAS variable, minitrd will load the relevant kernel module(s). When loading a kernel module, its dependencies need to be loaded first. Dependency information is stored in the modules.dep file in a Makefile-like syntax:
initrd# grep virtio_pci lib/modules/5.4.6/modules.dep
kernel/drivers/virtio/virtio_pci.ko: kernel/drivers/virtio/virtio_ring.ko kernel/drivers/virtio/virtio.ko
To load a module, we can open its file and then call the Linux-specific finit_module(2) system call. Some modules are expected to return an error code, e.g. ENODEV or ENOENT when some hardware device is not actually present. Side note: next to the textual versions, there are also binary versions of the modules.alias and modules.dep files. Presumably, those can be queried more quickly, but for simplicity, I have not (yet?) implemented support in minitrd.
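In shell terms, loading virtio_pci by hand would mean loading its dependencies first, in the order modules.dep implies (modprobe automates exactly this):
initrd# insmod lib/modules/5.4.6/kernel/drivers/virtio/virtio_ring.ko
initrd# insmod lib/modules/5.4.6/kernel/drivers/virtio/virtio.ko
initrd# insmod lib/modules/5.4.6/kernel/drivers/virtio/virtio_pci.ko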

2. Console settings: font, keyboard layout Setting a legible font is necessary for hi-dpi displays. On my Dell XPS 9360 (3200 x 1800 QHD+ display), the following works well:
initrd# setfont latarcyrheb-sun32
Setting the user's keyboard layout is necessary for entering the LUKS full-disk encryption passphrase in their preferred keyboard layout. I use the NEO layout:
initrd# loadkeys neo

3. Block device identification In the Linux kernel, block device enumeration order is not necessarily the same on each boot. Even if it was deterministic, device order could still change when users modify their computer's device topology (e.g. connect a new disk to a formerly unused port). Hence, it is good style to refer to disks and their partitions with stable identifiers. This also applies to boot loader configuration, and so most distributions will set a kernel parameter such as root=UUID=1fa04de7-30a9-4183-93e9-1b0061567121. Identifying the block device or partition with the specified UUID is the initramfs's job. Depending on what the device contains, the UUID comes from a different place. For example, ext4 file systems have a UUID field in their file system superblock, whereas LUKS volumes have a UUID in their LUKS header. Canonically, probing a device to extract the UUID is done by libblkid from the util-linux package, but the logic can easily be re-implemented in other languages and changes rarely. minitrd comes with its own implementation to avoid cgo or running the blkid(8) program.
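Outside of minitrd, the same probing can be seen with the blkid(8) program (using the LUKS partition and UUID from the qemu example later in this post):
initrd# blkid -s UUID -o value /dev/sda4
63051f8a-54b9-4996-b94f-3cf105af2900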

4. LUKS full-disk encryption unlocking (only on encrypted systems) Unlocking a LUKS-encrypted volume is done in userspace. The kernel handles the crypto, but reading the metadata, obtaining the passphrase (or e.g. key material from a file) and setting up the device mapper table entries are done in user space.
initrd# modprobe algif_skcipher
initrd# cryptsetup luksOpen /dev/sda4 cryptroot1
After the user entered their passphrase, the root file system can be mounted:
initrd# mount /dev/dm-0 /mnt

5. Continuing the boot process (switch_root) Now that everything is set up, we need to pass execution to the init program on the root file system with a careful sequence of chdir(2), mount(2), chroot(2), chdir(2) and execve(2) system calls that is explained in this busybox switch_root comment.
initrd# mount -t devtmpfs dev /mnt/dev
initrd# exec switch_root -c /dev/console /mnt /init
To conserve RAM, the files in the temporary file system to which the initramfs archive is extracted are typically deleted.

How is an initramfs generated? An initramfs image (more accurately: archive) is a compressed cpio archive. Typically, gzip compression is used, but the kernel supports a bunch of different algorithms, and distributions such as Ubuntu are switching to lz4. Generators typically prepare a temporary directory and feed it to the cpio(1) program. In minitrd, we read the files into memory and generate the cpio archive using the go-cpio package. We use the pgzip package for parallel gzip compression. The following files need to go into the cpio archive:

minitrd Go userland The minitrd binary is copied into the cpio archive as /init and will be run by the kernel after extracting the archive. Like the rest of distri, minitrd is built statically without cgo, which means it can be copied as-is into the cpio archive.

Linux kernel modules Aside from the modules.alias and modules.dep metadata files, the kernel modules themselves reside in e.g. /lib/modules/5.4.6/kernel and need to be copied into the cpio archive. Copying all modules results in an 80 MiB archive, so it is common to only copy modules that are relevant to the initramfs's features. This reduces archive size to 24 MiB. The filtering relies on hard-coded patterns and module names. For example, disk encryption related modules are all kernel modules underneath kernel/crypto, plus kernel/drivers/md/dm-crypt.ko. When generating a host-only initramfs (works on precisely the computer that generated it), some initramfs generators look at the currently loaded modules and just copy those.

Console Fonts and Keymaps The kbd package's setfont(8) and loadkeys(1) programs load console fonts and keymaps from /usr/share/consolefonts and /usr/share/keymaps, respectively. Hence, these directories need to be copied into the cpio archive. Depending on whether the initramfs should be generic (work on many computers) or host-only (work on precisely the computer/settings that generated it), the entire directories are copied, or only the required font/keymap.

cryptsetup, setfont, loadkeys These programs are (currently) required because minitrd does not implement their functionality. As they are dynamically linked, not only do the programs themselves need to be copied, but also the ELF dynamic linking loader (path stored in the .interp ELF section) and any ELF library dependencies. For example, cryptsetup in distri declares the ELF interpreter /ro/glibc-amd64-2.27-3/out/lib/ld-linux-x86-64.so.2 and declares dependencies on shared libraries libcryptsetup.so.12, libblkid.so.1 and others. Luckily, in distri, packages contain a lib subdirectory containing symbolic links to the resolved shared library paths (hermetic packaging), so it is sufficient to mirror the lib directory into the cpio archive, recursing into shared library dependencies of shared libraries. cryptsetup also requires the GCC runtime library libgcc_s.so.1 to be present at runtime, and will abort with an error message about not being able to call pthread_cancel(3) if it is unavailable.
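Both pieces of information can be inspected with standard tools (a sketch; the binary path is hypothetical and will differ per system):
% readelf -p .interp /path/to/cryptsetup    # prints the ELF interpreter path
% ldd /path/to/cryptsetup                   # lists the shared library dependencies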

time zone data To print log messages in the correct time zone, we copy /etc/localtime from the host into the cpio archive.

minitrd outside of distri? I currently have no desire to make minitrd available outside of distri. While the technical challenges (such as extending the generator to not rely on distri's hermetic packages) are surmountable, I don't want to support people's initramfs remotely. Also, I think that people's efforts should in general be spent on rallying behind dracut and making it work faster, thereby benefiting all Linux distributions that use dracut (increasingly more). With minitrd, I have demonstrated that significant speed-ups are achievable.

Conclusion It was interesting to dive into how an initramfs really works. I had been working with the concept for many years, from small tasks such as "debug why the encrypted root file system is not unlocked" to more complicated tasks such as "set up a root file system on DRBD for a high-availability setup". But even with that sort of experience, I didn't know all the details, until I was forced to implement every little thing. As I suspected going into this exercise, dracut is much slower than it needs to be. Re-implementing its generation stage in a modern language instead of shell helps a lot. Of course, my minitrd does a bit less than dracut, but not drastically so. The overall architecture is the same. I hope my effort helps with two things:
  1. As a teaching implementation: instead of wading through the various components that make up a modern initramfs (udev, systemd, various shell scripts, ...), people can learn about how an initramfs works in a single place.
  2. I hope the significant time difference motivates people to improve dracut.

Appendix: qemu development environment Before writing any Go code, I did some manual prototyping. Learning how other people prototype is often immensely useful to me, so I'm sharing my notes here. First, I copied all kernel modules and a statically built busybox binary:
% mkdir -p lib/modules/5.4.6
% cp -Lr /ro/lib/modules/5.4.6/* lib/modules/5.4.6/
% cp ~/busybox-1.22.0-amd64/busybox sh
To generate an initramfs from the current directory, I used:
% find . | cpio -o -H newc | pigz > /tmp/initrd
In distri's Makefile, I append these flags to the QEMU invocation:
-kernel /tmp/kernel \
-initrd /tmp/initrd \
-append "root=/dev/mapper/cryptroot1 rdinit=/sh ro console=ttyS0,115200 rd.luks=1 rd.luks.uuid=63051f8a-54b9-4996-b94f-3cf105af2900 rd.luks.name=63051f8a-54b9-4996-b94f-3cf105af2900=cryptroot1 rd.vconsole.keymap=neo rd.vconsole.font=latarcyrheb-sun32 init=/init systemd.setenv=PATH=/bin rw vga=836"
The vga= mode parameter is required for loading font latarcyrheb-sun32. Once in the busybox shell, I manually prepared the required mount points and kernel modules:
ln -s sh mount
ln -s sh lsmod
mkdir /proc /sys /run /mnt
mount -t proc proc /proc
mount -t sysfs sys /sys
mount -t devtmpfs dev /dev
modprobe virtio_pci
modprobe virtio_scsi
As a next step, I copied cryptsetup and dependencies into the initramfs directory:
% for f in /ro/cryptsetup-amd64-2.0.4-6/lib/*; do full=$(readlink -f $f); rel=$(echo $full | sed 's,^/,,g'); mkdir -p $(dirname $rel); install $full $rel; done
% ln -s ld-2.27.so ro/glibc-amd64-2.27-3/out/lib/ld-linux-x86-64.so.2
% cp /ro/glibc-amd64-2.27-3/out/lib/ld-2.27.so ro/glibc-amd64-2.27-3/out/lib/ld-2.27.so
% cp -r /ro/cryptsetup-amd64-2.0.4-6/lib ro/cryptsetup-amd64-2.0.4-6/
% mkdir -p ro/gcc-libs-amd64-8.2.0-3/out/lib64/
% cp /ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1 ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1
% ln -s /ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1 ro/cryptsetup-amd64-2.0.4-6/lib
% cp -r /ro/lvm2-amd64-2.03.00-6/lib ro/lvm2-amd64-2.03.00-6/
In busybox, I used the following commands to unlock the root file system:
modprobe algif_skcipher
./cryptsetup luksOpen /dev/sda4 cryptroot1
mount /dev/dm-0 /mnt

16 July 2020

Enrico Zini: Build Qt5 cross-builder with raspbian sysroot: compiling with the sysroot

Whack-A-Mole machines from <https://commons.wikimedia.org/wiki/File:Whac-A-Mole_Cedar_Point.jpg> This is part of a series of posts on compiling a custom version of Qt5 in order to develop for both amd64 and a Raspberry Pi. Now that I have a sysroot, I try to use it to build Qt5 with QtWebEngine. Nothing seems to work straightforwardly with Qt5's build system, and I hit an endless series of significant blockers to work around.
Problem in wayland code QtWayland's source currently does not compile:
../../../hardwareintegration/client/brcm-egl/qwaylandbrcmeglwindow.cpp: In constructor  QtWaylandClient::QWaylandBrcmEglWindow::QWaylandBrcmEglWindow(QWindow*) :
../../../hardwareintegration/client/brcm-egl/qwaylandbrcmeglwindow.cpp:131:67: error: no matching function for call to  QtWaylandClient::QWaylandWindow::QWaylandWindow(QWindow*&) 
     , m_eventQueue(wl_display_create_queue(mDisplay->wl_display()))
                                                                   ^
In file included from ../../../../include/QtWaylandClient/5.15.0/QtWaylandClient/private/qwaylandwindow_p.h:1,
                 from ../../../hardwareintegration/client/brcm-egl/qwaylandbrcmeglwindow.h:43,
                 from ../../../hardwareintegration/client/brcm-egl/qwaylandbrcmeglwindow.cpp:40:
../../../../include/QtWaylandClient/5.15.0/QtWaylandClient/private/../../../../../src/client/qwaylandwindow_p.h:97:5: note: candidate: 'QtWaylandClient::QWaylandWindow::QWaylandWindow(QWindow*, QtWaylandClient::QWaylandDisplay*)'
     QWaylandWindow(QWindow *window, QWaylandDisplay *display);
     ^~~~~~~~~~~~~~
../../../../include/QtWaylandClient/5.15.0/QtWaylandClient/private/../../../../../src/client/qwaylandwindow_p.h:97:5: note:   candidate expects 2 arguments, 1 provided
make[5]: Leaving directory '/home/build/armhf/qt-everywhere-src-5.15.0/qttools/src/qdoc'
I am not trying to debug here. I understand that Wayland support is not a requirement, and I'm adding -skip wayland to Qt5's configure options. Next round. nss not found Qt5 embeds Chrome's sources. Chrome's sources require libnss3-dev to be available for both host and target architectures. Although I now have it installed both on the build system and in the sysroot, the pkg-config wrapper that Qt5 hooks into Chrome's sources fails to find it:
Command: /usr/bin/python2 /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/3rdparty/chromium/build/config/linux/pkg-config.py -s /home/build/sysroot/ -a arm -p /usr/bin/arm-linux-gnueabihf-pkg-config --system_libdir lib nss -v -lssl3
Returned 1.
stderr:
Package nss was not found in the pkg-config search path.
Perhaps you should add the directory containing `nss.pc'
to the PKG_CONFIG_PATH environment variable
No package 'nss' found
Traceback (most recent call last):
  File "/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/3rdparty/chromium/build/config/linux/pkg-config.py", line 248, in <module>
    sys.exit(main())
  File "/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/3rdparty/chromium/build/config/linux/pkg-config.py", line 143, in main
    prefix = GetPkgConfigPrefixToStrip(options, args)
  File "/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/3rdparty/chromium/build/config/linux/pkg-config.py", line 82, in GetPkgConfigPrefixToStrip
    "--variable=prefix"] + args, env=os.environ).decode('utf-8')
  File "/usr/lib/python2.7/subprocess.py", line 223, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['/usr/bin/arm-linux-gnueabihf-pkg-config', '--variable=prefix', 'nss']' returned non-zero exit status 1
See //build/config/linux/nss/BUILD.gn:15:3: whence it was called.
  pkg_config("system_nss_no_ssl_config") {
  ^---------------------------------------
See //crypto/BUILD.gn:218:25: which caused the file to be included.
    public_configs += [ "//build/config/linux/nss:system_nss_no_ssl_config" ]
                        ^--------------------------------------------------
Project ERROR: GN run error!
It's trying to look into $SYSROOT/usr/lib/pkgconfig, while it should be $SYSROOT/usr/lib/arm-linux-gnueabihf/pkgconfig. I worked around this with this patch to qtwebengine/src/3rdparty/chromium/build/config/linux/pkg-config.py:
--- pkg-config.py.orig  2020-07-16 11:46:21.005373002 +0200
+++ pkg-config.py   2020-07-16 11:46:02.605296967 +0200
@@ -61,6 +61,7 @@
   libdir = sysroot + '/usr/' + options.system_libdir + '/pkgconfig'
   libdir += ':' + sysroot + '/usr/share/pkgconfig'
+  libdir += ':' + sysroot + '/usr/lib/arm-linux-gnueabihf/pkgconfig'
   os.environ['PKG_CONFIG_LIBDIR'] = libdir
   return libdir
Next round. g++ 8.3.0 Internal Compiler Error Qt5's sources embed Chrome's sources that embed the skia library sources. One of the skia library sources, when cross-compiled to ARM with -O1 or -O2 with g++ 8.3.0, produces an Internal Compiler Error:
/usr/bin/arm-linux-gnueabihf-g++ -MMD -MF obj/skia/skcms/skcms.o.d -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -DTOOLKIT_QT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DCR_SYSROOT_HASH=76e6068f9f6954e2ab1ff98ce5fa236d3d85bcbd -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -I../../3rdparty/chromium/third_party/skia/include/third_party/skcms -Igen -I../../3rdparty/chromium -w -std=c11 -mfp16-format=ieee -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -march=armv7-a -mfloat-abi=hard -mtune=generic-armv7-a -mfpu=vfpv3-d16 -mthumb -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wno-psabi -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-deprecated-declarations -fno-delete-null-pointer-checks -Wno-comments -Wno-packed-not-aligned -Wno-dangling-else -Wno-missing-field-initializers -Wno-unused-parameter -O2 -fno-ident -fdata-sections -ffunction-sections -fno-omit-frame-pointer -g0 -fvisibility=hidden -std=gnu++14 -Wno-narrowing -Wno-class-memaccess -Wno-attributes -Wno-class-memaccess -Wno-subobject-linkage -Wno-invalid-offsetof -Wno-return-type -Wno-deprecated-copy -fno-exceptions -fno-rtti --sysroot=../../../../../../sysroot/ -fvisibility-inlines-hidden -c ../../3rdparty/chromium/third_party/skia/third_party/skcms/skcms.cc -o obj/skia/skcms/skcms.o
during RTL pass: expand
In file included from ../../3rdparty/chromium/third_party/skia/third_party/skcms/skcms.cc:2053:
../../3rdparty/chromium/third_party/skia/third_party/skcms/src/Transform_inl.h: In function 'void baseline::exec_ops(const Op*, const void**, const char*, char*, int)':
../../3rdparty/chromium/third_party/skia/third_party/skcms/src/Transform_inl.h:766:13: internal compiler error: in convert_move, at expr.c:218
 static void exec_ops(const Op* ops, const void** args,
             ^~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-8/README.Bugs> for instructions.
I reported the bug at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96206. Since this source compiles with -O0, I attempted to fix this by editing qtwebkit/src/3rdparty/chromium/build/config/compiler/BUILD.gn and replacing instances of -O1 and -O2 with -O0. Spoiler: wrong attempt. We'll see it in the next round. Impossible constraint in asm Qt5's sources embed Chrome's sources, which embed the ffmpeg library sources. Even though ffmpeg's development libraries are present both on the host and in the target system, the build system insists on compiling and using the bundled version. Unfortunately, using -O0 breaks the build of ffmpeg:
/usr/bin/arm-linux-gnueabihf-gcc -MMD -MF obj/third_party/ffmpeg/ffmpeg_internal/opus.o.d -DHAVE_AV_CONFIG_H -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 -DPIC -DFFMPEG_CONFIGURATION=NULL -DCHROMIUM_NO_LOGGING -D_ISOC99_SOURCE -D_LARGEFILE_SOURCE -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -DTOOLKIT_QT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -DCR_SYSROOT_HASH=76e6068f9f6954e2ab1ff98ce5fa236d3d85bcbd -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DOPUS_FIXED_POINT -I../../3rdparty/chromium/third_party/ffmpeg/chromium/config/Chromium/linux/arm -I../../3rdparty/chromium/third_party/ffmpeg -I../../3rdparty/chromium/third_party/ffmpeg/compat/atomics/gcc -Igen -I../../3rdparty/chromium -I../../3rdparty/chromium/third_party/opus/src/include -fPIC -Wno-deprecated-declarations -fomit-frame-pointer -w -std=c99 -pthread -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -march=armv7-a -mfloat-abi=hard -mtune=generic-armv7-a -mfpu=vfpv3-d16 -mthumb -g0 -fvisibility=hidden -Wno-psabi -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-deprecated-declarations -fno-delete-null-pointer-checks -Wno-comments -Wno-packed-not-aligned -Wno-dangling-else -Wno-missing-field-initializers -Wno-unused-parameter -O0 -fno-ident -fdata-sections -ffunction-sections -std=gnu11 --sysroot=../../../../../../sysroot/ -c ../../3rdparty/chromium/third_party/ffmpeg/libavcodec/opus.c -o obj/third_party/ffmpeg/ffmpeg_internal/opus.o
In file included from ../../3rdparty/chromium/third_party/ffmpeg/libavutil/intmath.h:30,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavutil/common.h:106,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavutil/avutil.h:296,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavutil/audio_fifo.h:30,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavcodec/opus.h:28,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavcodec/opus_celt.h:29,
                 from ../../3rdparty/chromium/third_party/ffmpeg/libavcodec/opus.c:32:
../../3rdparty/chromium/third_party/ffmpeg/libavcodec/opus.c: In function 'ff_celt_quant_bands':
../../3rdparty/chromium/third_party/ffmpeg/libavutil/arm/intmath.h:77:5: error: impossible constraint in 'asm'
     __asm__ ("usat %0, %2, %1" : "=r"(x) : "r"(a), "i"(p));
     ^~~~~~~
The same source compiles when using -O2 instead of -O0. I worked around this by undoing the previous change and limiting -O0 to just the source file that causes the Internal Compiler Error. I edited qtwebengine/src/3rdparty/chromium/third_party/skia/third_party/skcms/skcms.cc to prepend:
#pragma GCC push_options
#pragma GCC optimize ("O0")
and append:
#pragma GCC pop_options
Next round. Missing build-deps for i386 code Qt5's sources embed Chrome's sources that embed the V8 library sources. For some reason, torque, which is part of V8, wants to build some of its sources as 32-bit code with -m32, and I did not have i386 cross-compilation libraries installed:
/usr/bin/g++ -MMD -MF v8_snapshot/obj/v8/torque_base/csa-generator.o.d -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -DTOOLKIT_QT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DV8_TYPED_ARRAY_MAX_SIZE_IN_HEAP=64 -DENABLE_MINOR_MC -DV8_INTL_SUPPORT -DV8_CONCURRENT_MARKING -DV8_ENABLE_LAZY_SOURCE_POSITIONS -DV8_EMBEDDED_BUILTINS -DV8_SHARED_RO_HEAP -DV8_WIN64_UNWINDING_INFO -DV8_ENABLE_REGEXP_INTERPRETER_THREADED_DISPATCH -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_DEPRECATION_WARNINGS -DV8_TARGET_ARCH_ARM -DCAN_USE_ARMV7_INSTRUCTIONS -DCAN_USE_VFP3_INSTRUCTIONS -DUSE_EABI_HARDFLOAT=1 -DV8_HAVE_TARGET_OS -DV8_TARGET_OS_LINUX -DDISABLE_UNTRUSTED_CODE_MITIGATIONS -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_DEPRECATION_WARNINGS -Iv8_snapshot/gen -I../../3rdparty/chromium -I../../3rdparty/chromium/v8 -Iv8_snapshot/gen/v8 -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -m32 -msse2 -mfpmath=sse -mmmx -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-deprecated-declarations -fno-delete-null-pointer-checks -Wno-comments -Wno-packed-not-aligned -Wno-dangling-else -Wno-missing-field-initializers -Wno-unused-parameter -fno-omit-frame-pointer -g0 -fvisibility=hidden -Wno-strict-overflow -Wno-return-type -O3 -fno-ident -fdata-sections -ffunction-sections -std=gnu++14 -Wno-narrowing -Wno-class-memaccess -Wno-attributes -Wno-class-memaccess -Wno-subobject-linkage -Wno-invalid-offsetof -Wno-return-type -Wno-deprecated-copy -fvisibility-inlines-hidden -fexceptions -frtti -c ../../3rdparty/chromium/v8/src/torque/csa-generator.cc -o v8_snapshot/obj/v8/torque_base/csa-generator.o
In file included from ../../3rdparty/chromium/v8/src/torque/csa-generator.h:8,
                 from ../../3rdparty/chromium/v8/src/torque/csa-generator.cc:5:
/usr/include/c++/8/iostream:38:10: fatal error: bits/c++config.h: No such file or directory
 #include <bits/c++config.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.
New build dependencies needed:
apt install lib32stdc++-8-dev
apt install libc6-dev-i386
dpkg --add-architecture i386
apt install linux-libc-dev:i386
Next round. OpenGL build issues The next bump is a set of OpenGL-related compiler issues:
/usr/bin/arm-linux-gnueabihf-g++ -MMD -MF obj/QtWebEngineCore/gl_ozone_glx_qt.o.d -DCHROMIUM_VERSION=\"80.0.3987.163\" -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -DTOOLKIT_QT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DCR_SYSROOT_HASH=76e6068f9f6954e2ab1ff98ce5fa236d3d85bcbd -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DQT_NO_LINKED_LIST -DQT_NO_KEYWORDS -DQT_USE_QSTRINGBUILDER -DQ_FORWARD_DECLARE_OBJC_CLASS=QT_FORWARD_DECLARE_CLASS -DQTWEBENGINECORE_VERSION_STR=\"5.15.0\" -DQTWEBENGINEPROCESS_NAME=\"QtWebEngineProcess\" -DBUILDING_CHROMIUM -DQTWEBENGINE_EMBEDDED_SWITCHES -DQT_NO_EXCEPTIONS -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -DQT_NO_DEBUG -DQT_QUICK_LIB -DQT_GUI_LIB -DQT_QMLMODELS_LIB -DQT_WEBCHANNEL_LIB -DQT_QML_LIB -DQT_NETWORK_LIB -DQT_POSITIONING_LIB -DQT_CORE_LIB -DQT_WEBENGINECOREHEADERS_LIB -DVK_NO_PROTOTYPES -DGL_GLEXT_PROTOTYPES -DUSE_GLX -DUSE_EGL -DGOOGLE_PROTOBUF_NO_RTTI -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER -DHAVE_PTHREAD -DU_USING_ICU_NAMESPACE=0 -DU_ENABLE_DYLOAD=0 -DUSE_CHROMIUM_ICU=1 -DU_STATIC_IMPLEMENTATION -DICU_UTIL_DATA_IMPL=ICU_UTIL_DATA_FILE -DUCHAR_TYPE=uint16_t -DWEBRTC_NON_STATIC_TRACE_EVENT_HANDLERS=0 -DWEBRTC_CHROMIUM_BUILD -DWEBRTC_POSIX -DWEBRTC_LINUX -DABSL_ALLOCATOR_NOTHROW=1 -DWEBRTC_USE_BUILTIN_ISAC_FIX=1 -DWEBRTC_USE_BUILTIN_ISAC_FLOAT=0 -DHAVE_SCTP -DNO_MAIN_THREAD_WRAPPING -DSK_HAS_PNG_LIBRARY -DSK_HAS_WEBP_LIBRARY -DSK_USER_CONFIG_HEADER=\"../../skia/config/SkUserConfig.h\" -DSK_GL -DSK_HAS_JPEG_LIBRARY -DSK_USE_LIBGIFCODEC -DSK_VULKAN_HEADER=\"../../skia/config/SkVulkanConfig.h\" -DSK_VULKAN=1 -DSK_SUPPORT_GPU=1 -DSK_GPU_WORKAROUNDS_HEADER=\"gpu/config/gpu_driver_bug_workaround_autogen.h\" -DVK_NO_PROTOTYPES -DLEVELDB_PLATFORM_CHROMIUM=1 -DLEVELDB_PLATFORM_CHROMIUM=1 -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_DEPRECATION_WARNINGS -I../../3rdparty/chromium/skia/config -I../../3rdparty/chromium/third_party -I../../3rdparty/chromium/third_party/boringssl/src/include -I../../3rdparty/chromium/third_party/skia/include/core -Igen -I../../3rdparty/chromium -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/api -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick/5.15.0/QtQuick -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui/5.15.0/QtGui -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels/5.15.0/QtQmlModels -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml/5.15.0/QtQml -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore/5.15.0/QtCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebchannel/include 
-I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebchannel/include/QtWebChannel -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtNetwork -I/home/build/armhf/qt-everywhere-src-5.15.0/qtlocation/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtlocation/include/QtPositioning -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore/5.15.0/QtWebEngineCore -I.moc -I/home/build/sysroot/opt/vc/include -I/home/build/sysroot/opt/vc/include/interface/vcos/pthreads -I/home/build/sysroot/opt/vc/include/interface/vmcs_host/linux -Igen/.moc -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/mkspecs/devices/linux-rasp-pi2-g++ -Igen -Igen -I../../3rdparty/chromium/third_party/libyuv/include -Igen -I../../3rdparty/chromium/third_party/jsoncpp/source/include -I../../3rdparty/chromium/third_party/jsoncpp/generated -Igen -Igen -I../../3rdparty/chromium/third_party/khronos -I../../3rdparty/chromium/gpu -I../../3rdparty/chromium/third_party/vulkan/include -I../../3rdparty/chromium/third_party/perfetto/include -Igen/third_party/perfetto/build_config -Igen -Igen -Igen/third_party/dawn/src/include -I../../3rdparty/chromium/third_party/dawn/src/include -Igen -I../../3rdparty/chromium/third_party/boringssl/src/include -I../../3rdparty/chromium/third_party/protobuf/src -Igen/protoc_out -I../../3rdparty/chromium/third_party/protobuf/src -I../../3rdparty/chromium/third_party/ced/src -I../../3rdparty/chromium/third_party/icu/source/common -I../../3rdparty/chromium/third_party/icu/source/i18n -I../../3rdparty/chromium/third_party/webrtc_overrides -I../../3rdparty/chromium/third_party/webrtc -Igen/third_party/webrtc -I../../3rdparty/chromium/third_party/abseil-cpp -I../../3rdparty/chromium/third_party/skia -I../../3rdparty/chromium/third_party/libgifcodec -I../../3rdparty/chromium/third_party/vulkan/include -I../../3rdparty/chromium/third_party/skia/third_party/vulkanmemoryallocator -I../../3rdparty/chromium/third_party/vulkan/include -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -I../../3rdparty/chromium/third_party/crashpad/crashpad -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/non_mac -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/linux -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/non_win -I../../3rdparty/chromium/third_party/libwebm/source -I../../3rdparty/chromium/third_party/leveldatabase -I../../3rdparty/chromium/third_party/leveldatabase/src -I../../3rdparty/chromium/third_party/leveldatabase/src/include -I../../3rdparty/chromium/v8/include -Igen/v8/include -I../../3rdparty/chromium/third_party/mesa_headers -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -march=armv7-a -mfloat-abi=hard -mtune=generic-armv7-a -mfpu=vfpv3-d16 -mthumb -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wno-psabi -Wno-unused-local-typedefs -Wno-maybe-uninitialized 
-Wno-deprecated-declarations -fno-delete-null-pointer-checks -Wno-comments -Wno-packed-not-aligned -Wno-dangling-else -Wno-missing-field-initializers -Wno-unused-parameter -O2 -fno-ident -fdata-sections -ffunction-sections -fno-omit-frame-pointer -g0 -fvisibility=hidden -g -O2 -fdebug-prefix-map=/home/build/armhf/qt-everywhere-src-5.15.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fno-exceptions -Wall -Wextra -D_REENTRANT -I/home/build/sysroot/usr/include/nss -I/home/build/sysroot/usr/include/nspr -std=gnu++14 -Wno-narrowing -Wno-class-memaccess -Wno-attributes -Wno-class-memaccess -Wno-subobject-linkage -Wno-invalid-offsetof -Wno-return-type -Wno-deprecated-copy -fno-exceptions -fno-rtti --sysroot=../../../../../../sysroot/ -fvisibility-inlines-hidden -g -O2 -fdebug-prefix-map=/home/build/armhf/qt-everywhere-src-5.15.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -std=gnu++1y -fno-exceptions -Wall -Wextra -D_REENTRANT -Wno-unused-parameter -Wno-unused-variable -Wno-deprecated-declarations -c /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/ozone/gl_ozone_glx_qt.cpp -o obj/QtWebEngineCore/gl_ozone_glx_qt.o
In file included from ../../3rdparty/chromium/ui/gl/gl_bindings.h:497,
                 from ../../3rdparty/chromium/ui/gl/gl_gl_api_implementation.h:12,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/ozone/gl_ozone_glx_qt.cpp:49:
../../3rdparty/chromium/ui/gl/gl_bindings_autogen_egl.h:227:5: error: 'EGLSetBlobFuncANDROID' has not been declared
     EGLSetBlobFuncANDROID set,
     ^~~~~~~~~~~~~~~~~~~~~
../../3rdparty/chromium/ui/gl/gl_bindings_autogen_egl.h:228:5: error: 'EGLGetBlobFuncANDROID' has not been declared
     EGLGetBlobFuncANDROID get);
     ^~~~~~~~~~~~~~~~~~~~~
../../3rdparty/chromium/ui/gl/gl_bindings_autogen_egl.h:571:46: error: 'EGLSetBlobFuncANDROID' has not been declared
                                              EGLSetBlobFuncANDROID set,
                                              ^~~~~~~~~~~~~~~~~~~~~
../../3rdparty/chromium/ui/gl/gl_bindings_autogen_egl.h:572:46: error: 'EGLGetBlobFuncANDROID' has not been declared
                                              EGLGetBlobFuncANDROID get) = 0;
                                              ^~~~~~~~~~~~~~~~~~~~~
cc1plus: warning: unrecognized command line option '-Wno-deprecated-copy'
/usr/bin/arm-linux-gnueabihf-g++ -MMD -MF obj/QtWebEngineCore/display_gl_output_surface.o.d -DCHROMIUM_VERSION=\"80.0.3987.163\" -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -DTOOLKIT_QT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DCR_SYSROOT_HASH=76e6068f9f6954e2ab1ff98ce5fa236d3d85bcbd -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DQT_NO_LINKED_LIST -DQT_NO_KEYWORDS -DQT_USE_QSTRINGBUILDER -DQ_FORWARD_DECLARE_OBJC_CLASS=QT_FORWARD_DECLARE_CLASS -DQTWEBENGINECORE_VERSION_STR=\"5.15.0\" -DQTWEBENGINEPROCESS_NAME=\"QtWebEngineProcess\" -DBUILDING_CHROMIUM -DQTWEBENGINE_EMBEDDED_SWITCHES -DQT_NO_EXCEPTIONS -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -DQT_NO_DEBUG -DQT_QUICK_LIB -DQT_GUI_LIB -DQT_QMLMODELS_LIB -DQT_WEBCHANNEL_LIB -DQT_QML_LIB -DQT_NETWORK_LIB -DQT_POSITIONING_LIB -DQT_CORE_LIB -DQT_WEBENGINECOREHEADERS_LIB -DVK_NO_PROTOTYPES -DGL_GLEXT_PROTOTYPES -DUSE_GLX -DUSE_EGL -DGOOGLE_PROTOBUF_NO_RTTI -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER -DHAVE_PTHREAD -DU_USING_ICU_NAMESPACE=0 -DU_ENABLE_DYLOAD=0 -DUSE_CHROMIUM_ICU=1 -DU_STATIC_IMPLEMENTATION -DICU_UTIL_DATA_IMPL=ICU_UTIL_DATA_FILE -DUCHAR_TYPE=uint16_t -DWEBRTC_NON_STATIC_TRACE_EVENT_HANDLERS=0 -DWEBRTC_CHROMIUM_BUILD -DWEBRTC_POSIX -DWEBRTC_LINUX -DABSL_ALLOCATOR_NOTHROW=1 -DWEBRTC_USE_BUILTIN_ISAC_FIX=1 -DWEBRTC_USE_BUILTIN_ISAC_FLOAT=0 -DHAVE_SCTP -DNO_MAIN_THREAD_WRAPPING -DSK_HAS_PNG_LIBRARY -DSK_HAS_WEBP_LIBRARY -DSK_USER_CONFIG_HEADER=\"../../skia/config/SkUserConfig.h\" -DSK_GL -DSK_HAS_JPEG_LIBRARY -DSK_USE_LIBGIFCODEC -DSK_VULKAN_HEADER=\"../../skia/config/SkVulkanConfig.h\" -DSK_VULKAN=1 -DSK_SUPPORT_GPU=1 -DSK_GPU_WORKAROUNDS_HEADER=\"gpu/config/gpu_driver_bug_workaround_autogen.h\" -DVK_NO_PROTOTYPES -DLEVELDB_PLATFORM_CHROMIUM=1 -DLEVELDB_PLATFORM_CHROMIUM=1 -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_DEPRECATION_WARNINGS -I../../3rdparty/chromium/skia/config -I../../3rdparty/chromium/third_party -I../../3rdparty/chromium/third_party/boringssl/src/include -I../../3rdparty/chromium/third_party/skia/include/core -Igen -I../../3rdparty/chromium -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/api -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick/5.15.0/QtQuick -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui/5.15.0/QtGui -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQuick -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtGui -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels/5.15.0/QtQmlModels -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml/5.15.0/QtQml -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore/5.15.0/QtCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQmlModels -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebchannel/include 
-I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebchannel/include/QtWebChannel -I/home/build/armhf/qt-everywhere-src-5.15.0/qtdeclarative/include/QtQml -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtNetwork -I/home/build/armhf/qt-everywhere-src-5.15.0/qtlocation/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtlocation/include/QtPositioning -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/include/QtCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore/5.15.0 -I/home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/include/QtWebEngineCore/5.15.0/QtWebEngineCore -I.moc -I/home/build/sysroot/opt/vc/include -I/home/build/sysroot/opt/vc/include/interface/vcos/pthreads -I/home/build/sysroot/opt/vc/include/interface/vmcs_host/linux -Igen/.moc -I/home/build/armhf/qt-everywhere-src-5.15.0/qtbase/mkspecs/devices/linux-rasp-pi2-g++ -Igen -Igen -I../../3rdparty/chromium/third_party/libyuv/include -Igen -I../../3rdparty/chromium/third_party/jsoncpp/source/include -I../../3rdparty/chromium/third_party/jsoncpp/generated -Igen -Igen -I../../3rdparty/chromium/third_party/khronos -I../../3rdparty/chromium/gpu -I../../3rdparty/chromium/third_party/vulkan/include -I../../3rdparty/chromium/third_party/perfetto/include -Igen/third_party/perfetto/build_config -Igen -Igen -Igen/third_party/dawn/src/include -I../../3rdparty/chromium/third_party/dawn/src/include -Igen -I../../3rdparty/chromium/third_party/boringssl/src/include -I../../3rdparty/chromium/third_party/protobuf/src -Igen/protoc_out -I../../3rdparty/chromium/third_party/protobuf/src -I../../3rdparty/chromium/third_party/ced/src -I../../3rdparty/chromium/third_party/icu/source/common -I../../3rdparty/chromium/third_party/icu/source/i18n -I../../3rdparty/chromium/third_party/webrtc_overrides -I../../3rdparty/chromium/third_party/webrtc -Igen/third_party/webrtc -I../../3rdparty/chromium/third_party/abseil-cpp -I../../3rdparty/chromium/third_party/skia -I../../3rdparty/chromium/third_party/libgifcodec -I../../3rdparty/chromium/third_party/vulkan/include -I../../3rdparty/chromium/third_party/skia/third_party/vulkanmemoryallocator -I../../3rdparty/chromium/third_party/vulkan/include -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -Igen/third_party/perfetto -I../../3rdparty/chromium/third_party/crashpad/crashpad -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/non_mac -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/linux -I../../3rdparty/chromium/third_party/crashpad/crashpad/compat/non_win -I../../3rdparty/chromium/third_party/libwebm/source -I../../3rdparty/chromium/third_party/leveldatabase -I../../3rdparty/chromium/third_party/leveldatabase/src -I../../3rdparty/chromium/third_party/leveldatabase/src/include -I../../3rdparty/chromium/v8/include -Igen/v8/include -I../../3rdparty/chromium/third_party/mesa_headers -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -march=armv7-a -mfloat-abi=hard -mtune=generic-armv7-a -mfpu=vfpv3-d16 -mthumb -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wno-psabi -Wno-unused-local-typedefs -Wno-maybe-uninitialized 
-Wno-deprecated-declarations -fno-delete-null-pointer-checks -Wno-comments -Wno-packed-not-aligned -Wno-dangling-else -Wno-missing-field-initializers -Wno-unused-parameter -O2 -fno-ident -fdata-sections -ffunction-sections -fno-omit-frame-pointer -g0 -fvisibility=hidden -g -O2 -fdebug-prefix-map=/home/build/armhf/qt-everywhere-src-5.15.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fno-exceptions -Wall -Wextra -D_REENTRANT -I/home/build/sysroot/usr/include/nss -I/home/build/sysroot/usr/include/nspr -std=gnu++14 -Wno-narrowing -Wno-class-memaccess -Wno-attributes -Wno-class-memaccess -Wno-subobject-linkage -Wno-invalid-offsetof -Wno-return-type -Wno-deprecated-copy -fno-exceptions -fno-rtti --sysroot=../../../../../../sysroot/ -fvisibility-inlines-hidden -g -O2 -fdebug-prefix-map=/home/build/armhf/qt-everywhere-src-5.15.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -std=gnu++1y -fno-exceptions -Wall -Wextra -D_REENTRANT -Wno-unused-parameter -Wno-unused-variable -Wno-deprecated-declarations -c /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp -o obj/QtWebEngineCore/display_gl_output_surface.o
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_interface.h:8,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/client_transfer_cache.h:15,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:28,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
/home/build/sysroot/opt/vc/include/GLES2/gl2.h:78: warning: "GL_FALSE" redefined
 #define GL_FALSE                          (GLboolean)0
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/client_context_state.h:10,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:27,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
../../3rdparty/chromium/third_party/khronos/GLES3/gl3.h:85: note: this is the location of the previous definition
 #define GL_FALSE                          0
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_interface.h:8,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/client_transfer_cache.h:15,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:28,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
/home/build/sysroot/opt/vc/include/GLES2/gl2.h:79: warning: "GL_TRUE" redefined
 #define GL_TRUE                           (GLboolean)1
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/client_context_state.h:10,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:27,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
../../3rdparty/chromium/third_party/khronos/GLES3/gl3.h:86: note: this is the location of the previous definition
 #define GL_TRUE                           1
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_interface.h:8,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/client_transfer_cache.h:15,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:28,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
/home/build/sysroot/opt/vc/include/GLES2/gl2.h:600:37: error: conflicting declaration of C function 'void glShaderSource(GLuint, GLsizei, const GLchar**, const GLint*)'
 GL_APICALL void         GL_APIENTRY glShaderSource (GLuint shader, GLsizei count, const GLchar** string, const GLint* length);
                                     ^~~~~~~~~~~~~~
In file included from ../../3rdparty/chromium/gpu/command_buffer/client/client_context_state.h:10,
                 from ../../3rdparty/chromium/gpu/command_buffer/client/gles2_implementation.h:27,
                 from /home/build/armhf/qt-everywhere-src-5.15.0/qtwebengine/src/core/compositor/display_gl_output_surface.cpp:47:
../../3rdparty/chromium/third_party/khronos/GLES3/gl3.h:624:29: note: previous declaration 'void glShaderSource(GLuint, GLsizei, const GLchar* const*, const GLint*)'
 GL_APICALL void GL_APIENTRY glShaderSource (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);
                             ^~~~~~~~~~~~~~
cc1plus: warning: unrecognized command line option '-Wno-deprecated-copy'
I'm out of the allocated hour budget, and I'll stop here for now. Building Qt5 has provided some of the most nightmarish work time of my entire professional life. If my daily job required dealing with this kind of insanity, I would strongly invest in a change of career. Update: Andreas Gruber wrote:
Long story short, a fast solution for the issue with EGLSetBlobFuncANDROID is to remove libraspberrypi-dev from your sysroot and do a full rebuild. There will be some changes to the configure results, so please review them - if they are relevant for you - before proceeding with your work.
And thanks to Andreas, the story can continue...

6 July 2020

Dirk Eddelbuettel: Rcpp 1.0.5: Several Updates

rcpp logo Right on the heels of the news of 2000 CRAN packages using Rcpp (and also hitting 12.5% of CRAN packages, or one in eight), we are happy to announce release 1.0.5 of Rcpp. Since the ten-year anniversary and the 1.0.0 release in November 2018, we have been sticking to a four-month release cycle. The last release has, however, left us with a particularly bad taste due to some rather peculiar interactions with a very small (but ever so vocal) portion of the user base. So going forward, we will change two things. First off, we reiterate that we already make rolling releases: each minor snapshot of the main git branch gets a point release. Between release 1.0.4 and this 1.0.5 release, there were in fact twelve of those. Each and every one of these was made available via the drat repo, and we will continue to do so going forward. Releases to CRAN, however, are real work. If they then end up with as much nonsense as the last release 1.0.4, we think it is appropriate to slow things down some more, so we intend to now switch to a six-month cycle. As mentioned, interim releases are always just one install.packages() call with a properly set repos argument away. Rcpp has become the most popular way of enhancing R with C or C++ code. As of today, 2002 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 203 in BioConductor. And per the (partial) logs of CRAN downloads, we are running steady at around one million downloads per month. This release again features a number of different pull requests by different contributors covering the full range of API improvements, attributes enhancements, changes to Sugar and helper functions, extended documentation as well as continuous integration deployment. See the list below for details.

Changes in Rcpp patch release version 1.0.5 (2020-07-01)
  • Changes in Rcpp API:
    • The exception handler code in #1043 was updated to ensure proper include behavior (Kevin in #1047 fixing #1046).
    • A missing Rcpp_list6 definition was added to support R 3.3.* builds (Davis Vaughan in #1049 fixing #1048).
    • Missing Rcpp_list{2,3,4,5} definitions were added to the Rcpp namespace (Dirk in #1054 fixing #1053).
    • A further update corrected the header include and provided a missing else branch (Mattias Ellert in #1055).
    • Two more assignments are protected with Rcpp::Shield (Dirk in #1059).
    • One call to abs is now properly namespaced with std:: (Uwe Korn in #1069).
    • String object memory preservation was corrected/simplified (Kevin in #1082).
  • Changes in Rcpp Attributes:
    • Empty strings are not passed to R CMD SHLIB which was seen with R 4.0.0 on Windows (Kevin in #1062 fixing #1061).
    • The short_file_name() helper function is safer with respect to temporaries (Kevin in #1067 fixing #1066, and #1071 fixing #1070).
  • Changes in Rcpp Sugar:
    • Two sample() objects are now standard vectors and not R_alloc created (Dirk in #1075 fixing #1074).
  • Changes in Rcpp support functions:
    • Rcpp.package.skeleton() adjusts for a (documented) change in R 4.0.0 (Dirk in #1088 fixing #1087).
  • Changes in Rcpp Documentation:
    • The pdf file of the earlier introduction is again typeset with bibliographic information (Dirk).
    • A new vignette describing how to package C++ libraries has been added (Dirk in #1078 fixing #1077).
  • Changes in Rcpp Deployment:
    • Travis CI unit tests now run a matrix over the versions of R also tested at CRAN (rel/dev/oldrel/oldoldrel), and coverage runs in parallel for a net speed-up (Dirk in #1056 and #1057).
    • The exceptions test is now partially skipped on Solaris as it already is on Windows (Dirk in #1065).
    • The default CI runner was upgraded to R 4.0.0 (Dirk).
    • The CI matrix spans R 3.5, 3.6, r-release and r-devel (Dirk).

Thanks to CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under the rcpp tag at StackOverflow which also allows searching among the (currently) 2455 previous questions. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

31 May 2020

Chris Lamb: Free software activities in May 2020

Here is my monthly update covering what I have been doing in the free software world during May 2020 (previous month): In Lintian, the static analysis tool for Debian packages:
Reproducible builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. The initiative is proud to be a member project of the Software Freedom Conservancy, a not-for-profit 501(c)(3) charity focused on ethical technology and user freedom. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.
Elsewhere in our tooling, I made the following changes to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading versions 142, 143, 144, 145 and 146 to Debian: I also performed a huge overhaul of diffoscope's website:
Lastly, I made a large number of changes to the main Reproducible Builds website and documentation:
Debian LTS This month I contributed 17 hours to Debian Long Term Support (LTS) and 9 hours to its sister Extended LTS project. You can find out more about the two projects via the following video:
Debian I filed the following bug reports in Debian this month: I also filed a number of bugs against packages that are not compatible with Django 3.x (organised around a single master bug), including django-taggit, sorl-thumbnail, django-simple-captcha, django-cas-server, django-cors-headers, python-django-csp, django-pipeline, python-django-jsonfield, python-django-contact-form, django-model-utils, django-fsm, django-modeltranslation, django-oauth-toolkit, libthumbor, python-django-extensions, python-django-imagekit, python-django-navtag, python-django-tagging, djangorestframework, django-haystack, django-taggit, django-simple-captcha, python-django-registration, python-django-pyscss, python-django-compressor, python-django-crispy-forms & python-django-mptt.
Lastly, I made the following uploads to Debian: I also sponsored an upload for adminer (4.7.7-1), also uploading it to buster-backports.

21 April 2020

Enrico Zini: Python 2 is dead

It's dead, that's what's wrong with it! It's passed on! This Python is no more! It has ceased to be! It's expired and gone to meet its maker! It's a stiff! Bereft of life, it rests in peace! If you hadn't nailed it to the perch it'd be pushing up the daisies! Its metabolic processes are now 'istory! It's off the twig! It's kicked the bucket, it's shuffled off its mortal coil, run down the curtain and joined the bleedin' choir invisible!! THIS IS AN EX-PYTHON!!

10 April 2020

Dirk Eddelbuettel: Rcpp 1.0.4.6: Bug fix interim version

rcpp logo Rcpp 1.0.4 was released on March 17, following the usual sequence of fairly involved reverse-depends checks along with a call for community testing issued weeks before the release. In that email I specifically pleaded with folks to pretty-please test non-standard setups:
It would be particularly beneficial if those with unusual build dependencies tested it as we would increase overall coverage beyond what I get from testing against 1800+ CRAN packages. BioConductor would also be welcome.
Alas, you can't always get what you want. Shortly after the release we were made aware that the two (large) pull requests at the bookends of the 1.0.3 to 1.0.4 release period created trouble. Of these two, the earliest PR in the 1.0.4 release upset older-than-CRAN-tested installations, i.e. R 3.3.0 or before. (Why you'd want to run R 3.3.* when R 3.6.3 is current is something I will never understand, but so be it.) This got addressed in two new PRs. And the matching last PR had a bit of sloppiness, leaving just about everyone alone, but not all those macbook-wearing data scientists using newer macOS SDKs not used by CRAN. In other words, unusual setups. But boy, do those folks have an ability to complain. Again, two quick PRs later that was addressed. Along came a minor PR with two more Rcpp::Shield<> uses (as life is too short to manually count PROTECT and UNPROTECT). And then a real issue between R 4.0.0 and Rcpp first noticed with RcppParallel builds on Windows but then also affecting RcppArmadillo. Another quickly issued fix. So by now the count is up to six, and we arrived at Rcpp 1.0.4.6. Which is now on CRAN, after having sat there for nearly a full week, and of course with no reason given. Because the powers that be move in mysterious ways. And don't answer to earthlings like us. As may transpire here, I am a little tired from all this. I think we can do better, and I think we damn well should, or I may as well throw in the towel and just release to the drat repo where each of the six interim versions was available for all to take as soon as it materialized. Anyway, here is the state of things. Rcpp has become the most popular way of enhancing R with C or C++ code. As of today, 1897 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 191 in BioConductor. And per the (partial) logs of CRAN downloads, we are running steady at one million downloads per month. The changes for this interim version are summarized below.

Changes in Rcpp patch release version 1.0.4.6 (2020-04-02)
  • Changes in Rcpp API:
    • The exception handler code in #1043 was updated to ensure proper include behavior (Kevin in #1047 fixing #1046).
    • A missing Rcpp_list6 definition was added to support R 3.3.* builds (Davis Vaughan in #1049 fixing #1048).
    • Missing Rcpp_list{2,3,4,5} definitions were added to the Rcpp namespace (Dirk in #1054 fixing #1053).
    • A further update corrected the header include and provided a missing else branch (Mattias Ellert in #1055).
    • Two more assignments are protected with Rcpp::Shield (Dirk in #1059).
  • Changes in Rcpp Attributes:
    • Empty strings are not passed to R CMD SHLIB which was seen with R 4.0.0 on Windows (Kevin in #1062 fixing #1061).
  • Changes in Rcpp Deployment:
    • Travis CI unit tests now run a matrix over the versions of R also tested at CRAN (rel/dev/oldrel/oldoldrel), and coverage runs in parallel for a net speed-up (Dirk in #1056 and #1057).

Thanks to CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under the rcpp tag at StackOverflow which also allows searching among the (currently) 2356 previous questions. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

6 April 2020

Joachim Breitner: A Telegram bot in Haskell on Amazon Lambda

I just had a weekend full of very successful serious geekery. On a whim I thought: Wouldn't it be nice if people could interact with my game Kaleidogen also via a Telegram bot? This led me to learn how to write a Telegram bot in Haskell and how to deploy such a Haskell program to Amazon Lambda. In particular the latter bit might be interesting to some of my readers, so here is how I went about it.

Kaleidogen Kaleidogen is a little contemplative game (or toy) where, starting from just unicolored disks, you combine abstract circular patterns to breed more interesting patterns. See my FARM 2019 talk for more details, or check out the source repository. BTW, I am looking for help turning it into an Android app!
KaleidogenBot in action

Amazon Lambda Amazon Lambda is the Function-as-a-Service offering of Amazon Web Services. The idea is that you don't rent a server that you would have to manage and pay for constantly; you just upload the code that responds to outside requests, and AWS takes care of the rest: starting and stopping instances, providing a secure base system, etc. When nobody is using the service, no cost occurs. This sounds ideal for hosting a toy Telegram bot: most of the time nobody will be using it, and I really don't want to have to babysit yet another service on my server. On Amazon Lambda, I can probably just forget about it. But Haskell is not one of the officially supported languages on Amazon Lambda. So to run Haskell on Lambda, one has to solve two problems:
  • how to invoke the Haskell code on the server, and
  • how to build Haskell so that it runs on the Amazon Linux distribution

A Haskell runtime for Lambda For the first we need a custom runtime. While this sounds complicated, it is actually a pretty simple concept: a runtime is an executable called bootstrap that queries the Lambda Runtime Interface for the next request to handle. The Lambda documentation is phrased as if this runtime had to be a dispatcher that calls the separate function's handler, but it could just as well do all the things directly. I found the Haskell package aws-lambda-haskell-runtime, which provides precisely that: a function
runLambda :: (LambdaOptions -> IO (Either String LambdaResult)) -> IO ()
that talks to the Lambda Runtime API and invokes its argument on each message. The package also provides Template Haskell magic to collect handlers of any JSON-able type and generates a dispatcher, like you might expect from other, more dynamic languages. But that was too much magic for me, so I ignored that and just wrote the handler manually:
main :: IO ()
main = runLambda run
  where
   run :: LambdaOptions -> IO (Either String LambdaResult)
   run opts = do
    result <- handler (decodeObj (eventObject opts)) (decodeObj (contextObject opts))
    either (pure . Left . encodeObj) (pure . Right . LambdaResult . encodeObj) result
data Event = Event
  { path :: T.Text
  , body :: Maybe T.Text
  } deriving (Generic, FromJSON)
data Response = Response
  { statusCode :: Int
  , headers :: Value
  , body :: T.Text
  , isBase64Encoded :: Bool
  } deriving (Generic, ToJSON)
handler :: Event -> Context -> IO (Either String Response)
handler Event{body, path} context =
  …
I expose my Lambda function to the world via Amazon's API Gateway, configured to just proxy the HTTP requests. This means that my code receives a JSON data structure describing the HTTP request (here called Event, listing only the fields I care about), and it will respond with a Response, again as JSON. The handler can then simply pattern-match on the path to decide what to do. For example this code handles URLs like /img/CAFFEEFACE.png, and responds with an image.
handler :: TC -> Event -> Context -> IO (Either String Response)
handler _ Event{body, path} context   -- the TC argument is unused in this excerpt
  | Just bytes <- isImgPath path >>= T.decodeHex = do
      let pngData = genPurePNG bytes
      pure $ Right Response
        { statusCode = 200
        , headers = object [ "Content-Type" .= ("image/png" :: String) ]
        , isBase64Encoded = True
        , body = T.decodeUtf8 $ LBS.toStrict $ Base64.encode pngData
        }
  …
isImgPath :: T.Text -> Maybe T.Text
isImgPath = T.stripPrefix "/img/" >=> T.stripSuffix ".png"
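For intuition, here is what that helper does in GHCi (my illustration, not from the original post; OverloadedStrings enabled):
ghci> isImgPath "/img/CAFFEEFACE.png"
Just "CAFFEEFACE"
ghci> isImgPath "/style.css"
Nothing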
If this program were to grow, one should probably use something more structured for routing; maybe servant, or bridging towards wai apps (almost like wai-lamda, but that still assumes an existing runtime, instead of simply being the runtime). But for my purposes, no extra layers of indirection or abstraction are needed!
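Stepping back, it may help to see how little a custom runtime really is. Here is a minimal sketch of a bootstrap loop, assuming the documented Lambda runtime HTTP API and the http-client package; it illustrates the concept and is not the actual aws-lambda-haskell-runtime code (handleEvent is a made-up placeholder):
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (forever)
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy as LBS
import Network.HTTP.Client
import System.Environment (getEnv)

main :: IO ()
main = do
  api <- getEnv "AWS_LAMBDA_RUNTIME_API"   -- host:port, set by Lambda
  mgr <- newManager defaultManagerSettings
  forever $ do
    -- long-poll for the next invocation
    next <- parseRequest ("http://" ++ api ++ "/2018-06-01/runtime/invocation/next")
    res  <- httpLbs next mgr
    let reqId = maybe (error "no request id") BS.unpack $
          lookup "Lambda-Runtime-Aws-Request-Id" (responseHeaders res)
    answer <- handleEvent (responseBody res)   -- the application logic
    -- post the result back for this particular request id
    post <- parseRequest ("http://" ++ api ++ "/2018-06-01/runtime/invocation/" ++ reqId ++ "/response")
    _ <- httpLbs post { method = "POST", requestBody = RequestBodyLBS answer } mgr
    return ()

-- placeholder: a real bootstrap would decode the event and dispatch to a handler
handleEvent :: LBS.ByteString -> IO LBS.ByteString
handleEvent = return
Everything else a runtime package provides (JSON decoding, the Template Haskell dispatcher) is convenience layered on top of this loop.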

Deploying Haskell to Lambda Building Haskell locally and deploying to different machines is notoriously tricky; you often end up depending on a shared library that is not available on the other platform. The aws-lambda-haskell-runtime package, and similar projects like serverless-haskell, solve this using stack and Docker, two technologies that are probably great, but I never warmed up to them. So instead of adding layers and complexity, can I solve this by making things simpler? If I compile my bootstrap into a static Linux binary, it should run on any Linux, including Amazon Linux. Unfortunately, building Haskell programs statically is also notoriously tricky. But it is made much simpler by the work of Niklas Hambüchen and others in the context of the Nix package manager, coordinated in the static-haskell-nix project. The promise here is that once you have set up building your project with Nix, getting a static version is just one flag away. The support is not completely upstreamed into nixpkgs proper yet, but their repository has a nix file that contains a nixpkgs set with their patches:
let pkgs = (import (sources.nixpkgs-static + "/survey/default.nix") {}).pkgs; in
This, plus a fairly standard nix setup to build the package, yields what I was hoping for:
$ nix-build -A kaleidogen
/nix/store/ppwyq4d964ahd6k56wsklh93vzw07ln0-kaleidogen-0.1.0.0
$ file result/bin/kaleidogen-amazon-lambda
result/bin/kaleidogen-amazon-lambda: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
$ ls -sh result/bin/kaleidogen-amazon-lambda
6,7M result/bin/kaleidogen-amazon-lambda
If we put this file, named bootstrap, into a zip file and upload it to Amazon Lambda, then it just works! Creating the zip file is easily scripted using nix:
  function-zip = pkgs.runCommandNoCC "kaleidogen-lambda" {
    buildInputs = [ pkgs.zip ];
  } ''
    mkdir -p $out
    cp ${kaleidogen}/bin/kaleidogen-amazon-lambda bootstrap
    zip $out/function.zip bootstrap
  '';
So to upload this, I use this one-liner (line-wrapped for your convenience):
nix-build -A function-zip &&
aws lambda update-function-code --function-name kaleidogen \
  --zip-file fileb://result/function.zip
Thanks to how Nix pins all dependencies, I am fairly confident that I can return to this project in 4 months and still be able to build it. Of course, I want continuous integration and deployment. So I build the project with GitHub Actions, using a cachix nix cache to significantly speed up the build, and auto-deploy to Lambda using aws-lambda-deploy; see my workflow file for details.

The Telegram part The above allows me to run basically any stateless service, and a Telegram bot is nothing else: when configured to act as a WebHook, Telegram will send a request with a message to our Lambda function, where we can react to it. The telegram-api package provides bindings for the Telegram Bot API (although I had to use the repository version, as the version on Hackage has some bitrot). Slightly simplified, I can write a handler for an Update:
handleUpdate :: Update -> TelegramClient ()
handleUpdate Update{ message = Just m } = do
  let c = ChatId (chat_id (chat m))
  liftIO $ printf "message from %s: %s\n" (maybe "?" user_first_name (from m)) (maybe "" T.unpack (text m))
  if "/start" `T.isPrefixOf` fromMaybe "" (text m)
  then do
    rm <- sendMessageM $ sendMessageRequest c "Hi! I am @KaleidogenBot."
    return ()
  else do
    m1 <- sendMessageM $ sendMessageRequest c "One moment…"
    withPNGFile $ \pngFN -> do
      m2 <- uploadPhotoM $ uploadPhotoRequest c
        (FileUpload (Just "image/png") (FileUploadFile pngFN))
      return ()
handleUpdate u =
  liftIO $ putStrLn $ "Unhandled message: " ++ show u
and call this from the handler that I wrote above:
     
  | path == "/telegram" =
      case eitherDecode (LBS.fromStrict (T.encodeUtf8 (fromMaybe "" body))) of
        Left err -> …
        Right update -> do
          runTelegramClient token manager $ handleUpdate Nothing update
          pure $ Right Response
            { statusCode = 200
            , headers = object [ "Content-Type" .= ("text/plain" :: String) ]
            , isBase64Encoded = False
            , body = "Done"
            }
  …
Note that the Lambda code receives the request as a JSON data structure with a body that contains the original HTTP request body, which, in this case, is itself JSON, so we have to decode that. All that is left to do is to tell Telegram where this code lives:
curl --request POST \
  --url https://api.telegram.org/bot<token>/setWebhook \
  --header 'content-type: application/json' \
  --data '{"url": "https://api.kaleidogen.nomeata.de/telegram"}'
As a little add-on, I also created a Telegram game for Kaleidogen. A Telegram game is nothing but a webpage that runs inside Telegram, so it wasn't much work to wrap the Web version of Kaleidogen that way, but the resulting Telegram game (which you can access via https://core.telegram.org/bots/games) still looks pretty neat.

No /dev/dri/renderD128 I am mostly happy with this setup: my game is now available to more people in more ways. I don't have to maintain any infrastructure. When nobody is using this bot, no resources are wasted, and the costs of the service are negligible -- this is unlikely to go beyond the free tier, and even if it did, the cost per generated image is roughly USD 0.000021. There is one slight disappointment, though. What I find most interesting about Kaleidogen from a technical point of view is that when you play it in the browser, the images are not generated by my code. Instead, my code creates a WebGL shader program on the fly, and that program generates the image on your graphics card. I even managed to make the GL rendering code work headlessly, i.e. from a command line program, using EGL and libgbm and a helper written in C. But it needs access to a graphics card via /dev/dri/renderD128. Amazon does not provide that to Lambda code, and neither do the other big Function-as-a-Service providers. So I had to swallow my pride and reimplement the rendering in pure Haskell. So if you think the bot is kinda slow, then that's why. Despite properly optimizing the pure implementation (the inner loop does not do allocations and deals only with unboxed Double# values), the GL shader version is still three times as fast. Maybe in a few years GPU access will be so ubiquitous that it's even on Amazon Lambda; then I can easily use that.
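To make the remark about unboxed Double# values concrete, the style of loop involved looks roughly like this; a generic sketch of an allocation-free inner loop (my illustration, not Kaleidogen's actual rendering code):
{-# LANGUAGE BangPatterns #-}
-- Strict (banged) arguments let GHC's worker/wrapper transformation unbox
-- the Int counter and Double accumulator into registers: no allocation.
pixelSum :: Int -> Double
pixelSum n = go 0 0.0
  where
    go :: Int -> Double -> Double
    go !i !acc
      | i >= n    = acc
      | otherwise = go (i + 1) (acc + sin (fromIntegral i) * 0.5)

main :: IO ()
main = print (pixelSum 1000000)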

17 March 2020

Dirk Eddelbuettel: Rcpp 1.0.4: Lots of goodies

The fourth maintenance release 1.0.4 of Rcpp, following up on the 10th anniversary and the 1.0.0 release sixteen months ago, arrived on CRAN this morning. This follows a few days of gestation at CRAN. To help during the wait we provided this release via drat last Friday. And it followed a pre-release via drat a week earlier. But now that the release is official, Windows and macOS binaries will be built by CRAN over the next few days. The corresponding Debian package will be uploaded as a source package shortly, after which binaries can be built. As with the previous releases Rcpp 1.0.1, Rcpp 1.0.2 and Rcpp 1.0.3, we have the predictable and expected four month gap between releases, which seems appropriate given both the changes still being made (see below) and the relative stability of Rcpp. It still takes work to release this as we run multiple extensive sets of reverse dependency checks, so maybe one day we will switch to a six month cycle. For now, four months still seems like a good pace. Rcpp has become the most popular way of enhancing R with C or C++ code. As of today, 1873 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 191 in BioConductor. And per the (partial) logs of CRAN downloads, we are running steady at one million downloads per month. This release features quite a number of different pull requests by seven different contributors as detailed below. One (personal) highlight is the switch to tinytest.

Changes in Rcpp version 1.0.4 (2020-03-13)
  • Changes in Rcpp API:
    • Safer Rcpp_list*, Rcpp_lang* and Function.operator() (Romain in #1014, #1015).
    • A number of #nocov markers were added (Dirk in #1036, #1042 and #1044).
    • Finalizer calls clear external pointer first (Kirill Müller and Dirk in #1038).
    • Scalar operations with a rhs matrix no longer change the matrix value (Qiang in #1040 fixing (again) #365).
    • Rcpp::exception and Rcpp::stop are now more thread-safe (Joshua Pritikin in #1043).
  • Changes in Rcpp Attributes:
    • The cppFunction helper now deals correctly with multiple depends arguments (TJ McKinley in #1016 fixing #1017).
    • Invisible return objects are now supported via new option (Kun Ren in #1025 fixing #1024).
    • Unavailable packages referred to in LinkingTo are now reported (Dirk in #1027 fixing #1026).
    • The sourceCpp function can now create a debug DLL on Windows (Dirk in #1037 fixing #1035).
  • Changes in Rcpp Documentation:
    • The .github/ directory now has more explicit guidance on contributing, issues, and pull requests (Dirk).
    • The Rcpp Attributes vignette describes the new invisible return object option (Kun Ren in #1025).
    • Vignettes are now included as pre-made pdf files (Dirk in #1029).
    • The Rcpp FAQ has a new entry on the recommended importFrom directive (Dirk in #1031 fixing #1030).
    • The bib file for the vignette was once again updated to current package versions (Dirk).
  • Changes in Rcpp Deployment:
    • Added unit test to check if C++ version remains aligned with the package number (Dirk in #1022 fixing #1021).
    • The unit test system was switched to tinytest (Dirk in #1028, #1032, #1033).

Please note that the change to exceptions and Rcpp::stop() in pr #1043 has been seen to have a minor side effect on macOS (issue #1046), which has already been fixed by Kevin in pr #1047, for which I may prepare a 1.0.4.1 release for the Rcpp drat repo in a day or two. Thanks to CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under the rcpp tag at StackOverflow, which also allows searching among the (currently) 2356 previous questions. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 October 2017

Russ Allbery: Bundle haul

Confession time: I started making these posts (eons ago) because a close friend did as well, and I enjoyed reading them. But the main reason I continue is that the primary way I have to keep track of the books I've bought and avoid duplicates is, well, grep on these posts. I should come up with a non-bullshit way of doing this, but time to do more elegant things is in short supply, and, well, it's my blog. So I'm boring all of you who read this in various places with my internal bookkeeping. I do try to at least add a bit of commentary. This one will be more tedious than most since it includes five separate Humble Bundles, which increases the volume a lot. (I just realized I'd forgotten to record those purchases from the past several months.) First, the individual books I bought directly: Ilona Andrews Sweep in Peace (sff)
Ilona Andrews One Fell Sweep (sff)
Steven Brust Vallista (sff)
Nicky Drayden The Prey of Gods (sff)
Meg Elison The Book of the Unnamed Midwife (sff)
Pat Green Night Moves (nonfiction)
Ann Leckie Provenance (sff)
Seanan McGuire Once Broken Faith (sff)
Seanan McGuire The Brightest Fell (sff)
K. Arsenault Rivera The Tiger's Daughter (sff)
Matthew Walker Why We Sleep (nonfiction)
Some new books by favorite authors, a few new releases I heard good things about, and two (Night Moves and Why We Sleep) from references in on-line articles that impressed me. The books from security bundles (this is mostly work reading, assuming I'll get to any of it), including a blockchain bundle: Wil Allsop Unauthorised Access (nonfiction)
Ross Anderson Security Engineering (nonfiction)
Chris Anley, et al. The Shellcoder's Handbook (nonfiction)
Conrad Barsky & Chris Wilmer Bitcoin for the Befuddled (nonfiction)
Imran Bashir Mastering Blockchain (nonfiction)
Richard Bejtlich The Practice of Network Security (nonfiction)
Kariappa Bheemaiah The Blockchain Alternative (nonfiction)
Violet Blue Smart Girl's Guide to Privacy (nonfiction)
Richard Caetano Learning Bitcoin (nonfiction)
Nick Cano Game Hacking (nonfiction)
Bruce Dang, et al. Practical Reverse Engineering (nonfiction)
Chris Dannen Introducing Ethereum and Solidity (nonfiction)
Daniel Drescher Blockchain Basics (nonfiction)
Chris Eagle The IDA Pro Book, 2nd Edition (nonfiction)
Nikolay Elenkov Android Security Internals (nonfiction)
Jon Erickson Hacking, 2nd Edition (nonfiction)
Pedro Franco Understanding Bitcoin (nonfiction)
Christopher Hadnagy Social Engineering (nonfiction)
Peter N.M. Hansteen The Book of PF (nonfiction)
Brian Kelly The Bitcoin Big Bang (nonfiction)
David Kennedy, et al. Metasploit (nonfiction)
Manul Laphroaig (ed.) PoC GTFO (nonfiction)
Michael Hale Ligh, et al. The Art of Memory Forensics (nonfiction)
Michael Hale Ligh, et al. Malware Analyst's Cookbook (nonfiction)
Michael W. Lucas Absolute OpenBSD, 2nd Edition (nonfiction)
Bruce Nikkel Practical Forensic Imaging (nonfiction)
Sean-Philip Oriyano CEHv9 (nonfiction)
Kevin D. Mitnick The Art of Deception (nonfiction)
Narayan Prusty Building Blockchain Projects (nonfiction)
Prypto Bitcoin for Dummies (nonfiction)
Chris Sanders Practical Packet Analysis, 3rd Edition (nonfiction)
Bruce Schneier Applied Cryptography (nonfiction)
Adam Shostack Threat Modeling (nonfiction)
Craig Smith The Car Hacker's Handbook (nonfiction)
Dafydd Stuttard & Marcus Pinto The Web Application Hacker's Handbook (nonfiction)
Albert Szmigielski Bitcoin Essentials (nonfiction)
David Thiel iOS Application Security (nonfiction)
Georgia Weidman Penetration Testing (nonfiction)
Finally, the two SF bundles: Buzz Aldrin & John Barnes Encounter with Tiber (sff)
Poul Anderson Orion Shall Rise (sff)
Greg Bear The Forge of God (sff)
Octavia E. Butler Dawn (sff)
William C. Dietz Steelheart (sff)
J.L. Doty A Choice of Treasons (sff)
Harlan Ellison The City on the Edge of Forever (sff)
Toh Enjoe Self-Reference ENGINE (sff)
David Feintuch Midshipman's Hope (sff)
Alan Dean Foster Icerigger (sff)
Alan Dean Foster Mission to Moulokin (sff)
Alan Dean Foster The Deluge Drivers (sff)
Taiyo Fujii Orbital Cloud (sff)
Hideo Furukawa Belka, Why Don't You Bark? (sff)
Haikasoru (ed.) Saiensu Fikushon 2016 (sff anthology)
Joe Haldeman All My Sins Remembered (sff)
Jyouji Hayashi The Ouroboros Wave (sff)
Sergei Lukyanenko The Genome (sff)
Chohei Kambayashi Good Luck, Yukikaze (sff)
Chohei Kambayashi Yukikaze (sff)
Sakyo Komatsu Virus (sff)
Miyuki Miyabe The Book of Heroes (sff)
Kazuki Sakuraba Red Girls (sff)
Robert Silverberg Across a Billion Years (sff)
Allen Steele Orbital Decay (sff)
Bruce Sterling Schismatrix Plus (sff)
Michael Swanwick Vacuum Flowers (sff)
Yoshiki Tanaka Legend of the Galactic Heroes, Volume 1: Dawn (sff)
Yoshiki Tanaka Legend of the Galactic Heroes, Volume 2: Ambition (sff)
Yoshiki Tanaka Legend of the Galactic Heroes, Volume 3: Endurance (sff)
Tow Ubukata Mardock Scramble (sff)
Sayuri Ueda The Cage of Zeus (sff)
Sean Williams & Shane Dix Echoes of Earth (sff)
Hiroshi Yamamoto MM9 (sff)
Timothy Zahn Blackcollar (sff)
Phew. Okay, all caught up, and hopefully won't have to dump something like this again in the near future. Also, more books than I have any actual time to read, but what else is new.

16 October 2017

Gustavo Noronha Silva: Who knew we still had low-hanging fruits?

Earlier this month I had the pleasure of attending the Web Engines Hackfest, hosted by Igalia at their offices in A Coruña, and also sponsored by my employer, Collabora, Google and Mozilla. It has grown a lot and we had many new people this year. Fun fact: I am one of the 3 or 4 people who have attended all of the editions of the hackfest since its inception in 2009, when it was called WebKitGTK+ hackfest \o/ It was a great get together where I met many friends and made some new ones. Had plenty of discussions, mainly with Antonio Gomes and Google's Robert Kroeger, about the way forward for Chromium on Wayland. We had the opportunity of explaining how we at Collabora cooperated with Igalians to implement and optimise a Wayland nested compositor for WebKit2 to share buffers between processes in an efficient way even on broken drivers. Most of the discussions and some of the work that led to this was done in previous hackfests, by the way! The idea seems to have been mostly welcomed, the only concern being that Wayland's interfaces would need to be tested for security (fuzzed). So we may end up going that same route with Chromium for allowing process separation between the UI and GPU (being renamed Viz, currently) processes. On another note, and going back to the title of the post, at Collabora we have recently adopted Mattermost to replace our internal IRC server. Many Collaborans have decided to use Mattermost through an Epiphany Web Application or through a simple Python application that just shows a GTK+ window wrapping a WebKitGTK+ WebView. Some people noticed that when the connection was lost Mattermost would take a very long time to notice and reconnect: its web sockets were taking a long, long time to time out, according to our colleague Andrew Shadura. I did some quick searching in the codebase and noticed WebCore has a NetworkStateNotifier interface that it uses to get notified when the connection changes. That was not implemented for WebKitGTK+, so it was likely what caused stuff to linger when a connection hiccup happened. Given we have GNetworkMonitor, implementation of the missing interfaces required only 3 lines of actual code (plus the necessary boilerplate)! I was surprised to still find such a low-hanging fruit in WebKitGTK+, so I decided to look for more. Turns out WebCore also has a notifier for low-power situations, which was implemented only by the iOS port, and causes the engine to throttle some timers and avoid some expensive checks it would do in normal situations. This required a few more lines to implement using upower-glib, but not that many either! That was the fun I had during the hackfest in terms of coding. Mostly I had fun just lurking in break-out sessions discussing the past, present and future of tech such as WebRTC, Servo, Rust, WebKit, Chromium, WebVR, and more. I also beat a few challengers in Street Fighter 2, as usual. I'd like to say thanks to Collabora, Igalia, Google, and Mozilla for sponsoring and attending the hackfest. Thanks to Igalia for hosting and to Collabora for sponsoring my attendance along with two other Collaborans. It was a great hackfest and I'm looking forward to the next one! See you in 2018 =)

3 October 2017

Dimitri John Ledkov: An interesting bug - network-manager, glibc, dpkg-shlibdeps, systemd, and finally binutils

Not so long ago I went to effectively recompile NetworkManager and fix up a minor bug in it. It built fine across all architectures, was considered installable, etc., and I was expecting it to just migrate across. At the time, glibc was at 2.26 in artful-proposed and NetworkManager was built against it. However, the release pocket was at glibc 2.24. In Ubuntu we have a ProposedMigration process in place which ensures that newly built packages do not regress in the number of architectures built for; are installable; and do not regress themselves or any reverse dependencies at runtime.

Thus, before my build of NetworkManager was considered for migration, it was tested in the release pocket against packages in the release pocket. Specifically, since the package metadata only requires glibc 2.17, NetworkManager was tested against the glibc currently in the release pocket, which should just work fine....
autopkgtest [21:47:38]: test nm: [-----------------------
test_auto_ip4 (__main__.ColdplugEthernet)
ethernet: auto-connection, IPv4 ... FAIL ----- NetworkManager.log -----
NetworkManager: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by NetworkManager)
At first I only saw failing tests, which I thought were transient failures, so they were retried a few times. Then I looked at the autopkgtest log and saw the above error messages. Perplexed, I started a lxd container with ubuntu artful, enabled proposed, and installed just network-manager from artful-proposed, and indeed a simple NetworkManager --help failed with the above error from the linker.

I am too young to know what dependency hell means: ever since I started using Linux (Ubuntu 7.04), all glibc symbols have been versioned, and dpkg-shlibdeps would generate correct minimum dependencies for a package. Alas, in this case readelf confirmed that /usr/sbin/NetworkManager indeed requires 2.25, while the dpkg dependency says >= 2.17.

Reading the readelf output further, I checked that all of the glibc symbols used are 2.17 or lower, and only the "Version needs section '.gnu.version_r'" referenced a GLIBC_2.25 symbol. Inspecting the dpkg-shlibdeps code, I noticed that it does not parse that section and only searches through the dynamic symbols used to establish the minimum required version.

Things started to smell fishy. On one hand, I trust dpkg-shlibdeps to generate the right dependencies. On the other hand, I also trust the linker not to tell lies. Hence I opened a Debian BTS bug report about this issue.

At this point, I really wanted to figure out where the reference to 2.25 comes from. Clearly it was not from any private symbols, as then the reference would be to 2.26. Checking the glibc ABI lists, I found there were only a handful of symbols marked as 2.25:
$ grep 2.25 ./sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
GLIBC_2.25 GLIBC_2.25 A
GLIBC_2.25 __explicit_bzero_chk F
GLIBC_2.25 explicit_bzero F
GLIBC_2.25 getentropy F
GLIBC_2.25 getrandom F
GLIBC_2.25 strfromd F
GLIBC_2.25 strfromf F
GLIBC_2.25 strfroml F
Blindly grepping for these in the network-manager source tree, I found the following:
$ grep explicit_bzero -r configure.ac src/
configure.ac: explicit_bzero],
src/systemd/src/basic/string-util.h:void explicit_bzero(void *p, size_t l);
src/systemd/src/basic/string-util.c:void explicit_bzero(void *p, size_t l)
src/systemd/src/basic/string-util.c: explicit_bzero(x, strlen(x));
First of all, it seems network-manager includes a partial embedded copy of systemd. Secondly, that code is compiled into a temporary library and has autoconf detection logic to use explicit_bzero. It also has an embedded implementation of explicit_bzero for when it is not available in libc; however, it does not have the FORTIFY_SOURCE implementation of said function (__explicit_bzero_chk), as was later pointed out to me. And whilst this function is compiled into an intermediary noinst library, no functions that use explicit_bzero end up being used by the NetworkManager binary. To prove this, I dropped all code that uses explicit_bzero, rebuilt the package against glibc 2.26, and voila: it only had a Version reference on glibc 2.17, as expected from the end-result usage of shared symbols.

At this point a toolchain bug was the suspect: it seems that whilst the explicit_bzero shared symbol got optimised out, the version reference on 2.25 persisted in the linked binaries. At that time the archive was using a snapshot version of binutils, and in fact forcefully downgrading binutils resulted in a correct compilation, with the version table referencing only glibc 2.17.

Matthias then took a tarball of object files and filed an upstream bug report against binutils: "[2.29 Regression] ld.bfd keeps a version reference in .gnu.version_r for symbols which are optimized out". The discussion in that bug report is a bit beyond me, as to me binutils is black magic. All I understood there was "we moved sweep and pass to another place due to some bugs"; doing that introduced this bug, thus do multiple sweeps and passes to make sure we fix old bugs and don't regress this either. Or something like that. Comments / better descriptions of the binutils fix are welcome.

Binutils got fixed by upstream developers, cherry-picked into Debian and Ubuntu, network-manager got rebuilt, and everything is wonderful now. However, it does look like unused / dead-end code paths tripped up optimisations in the toolchain, which managed to slip by distribution package dependency generation and needlessly require a higher version of glibc. I guess the lesson here is: do not embed/compile unused code. Also, I'm not sure why network-manager uses networkd internals like this; maybe systemd should expose more APIs or serialise more state into /run, as most other things query things over dbus, a private socket, or by establishing watches on /run/systemd/netif. I'll look into that another day.

Thanks a lot to Guillem Jover, Matthias Klose, Alan Modra, H.J. Lu, and others for getting involved. I would not be able to raise, debug, or fix this issue all by myself.

13 September 2017

Reproducible builds folks: Reproducible Builds: Weekly report #124

Here's what happened in the Reproducible Builds effort between Sunday September 3 and Saturday September 9 2017: Media coverage GSoC and Outreachy updates Debian will participate in this year's Outreachy initiative and the Reproducible Builds project is soliciting mentors and students to join this round. For more background please see the following mailing list posts: 1, 2 & 3. Reproducibility work in Debian In addition, the following NMUs were accepted: Reproducibility work in other projects Patches sent upstream: Packages reviewed and fixed, and bugs filed Reviews of unreproducible packages 3 package reviews have been added, 2 have been updated and 2 have been removed this week, adding to our knowledge about identified issues. Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development Development continued in git, including the following contributions: Mattia Rizzolo also uploaded the version 86 released last week to stretch-backports. reprotest development tests.reproducible-builds.org Misc. This week's edition was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Mattia Rizzolo & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

13 August 2017

Enrico Zini: Consensually doing things together?

On 2017-08-06 I gave a talk at DebConf17 in Montreal titled "Consensually doing things together?" (video). Here are the talk notes. Abstract At DebConf Heidelberg I talked about how Free Software has a lot to do with consensually doing things together. Is that always true, at least in Debian? I'd like to explore what motivates one to start a project and what motivates one to keep maintaining it. What are the energy levels required to manage bits of Debian as the project keeps growing. How easy it is to say no. Whether we have roles in Debian that require irreplaceable heroes to keep them going. What could be done to make life easier for heroes, easy enough that mere mortals can help, or take their place. Unhappy is the community that needs heroes, and unhappy is the community that needs martyrs. I'd like to try and make sure that now, or in the very near future, Debian is not such an unhappy community. Consensually doing things together I gave a talk in Heidelberg. Valhalla made stickers; Debian France distributed many of them. There's one on my laptop. Which reminds me of what we ought to be doing. Of what we have a chance to do, if we play our cards right. I'm going to talk about relationships. Consensual relationships. Relationships in short. Nonconsensual relationships are usually called abuse. I like to see Debian as a relationship between multiple people. And I'd like it to be a consensual one. I'd like it not to be abuse. Consent From Wikipedia:
In Canada "consent means the voluntary agreement of the complainant to engage in sexual activity" without abuse or exploitation of "trust, power or authority", coercion or threats.[7] Consent can also be revoked at any moment.[8] There are 3 pillars often included in the description of sexual consent, or "the way we let others know what we're up for, be it a good-night kiss or the moments leading up to sex." They are:
  • Knowing exactly what and how much I'm agreeing to
  • Expressing my intent to participate
  • Deciding freely and voluntarily to participate[20]
Saying "I've decided I won't do laundry anymore" when the other partner is tired, or busy doing things. Is different than saying "I've decided I won't do laundry anymore" when the other partner has a chance to say "why? tell me more" and take part in negotiation. Resources: Relationships Debian is the Universal Operating System. Debian is made and maintained by people. The long term health of debian is a consequence of the long term health of the relationship between Debian contributors. Debian doesn't need to be technically perfect, it needs to be socially healthy. Technical problems can be fixed by a healty community. graph showing relationship between avoidance, accomodation, compromise, competition, collaboration The Thomas-Kilmann Conflict Mode Instrument: source png. Motivations Quick poll: What are your motivations to be in a relationship? Which of those motivations are healthy/unhealthy? "Galadriel" (noun, by Francesca Ciceri): a task you have to do otherwise Sauron takes over Middle Earth See: http://blog.zouish.org/nonupdd/#/22/1 What motivates me to start a project or pick one up? What motivates me to keep maintaning a project? What motivates you? What's an example of a sustainable motivation? Is it really all consensual in Debian? Energy Energy that thing which is measured in spoons. The metaphore comes from people suffering with chronic health issues:
"Spoons" are a visual representation used as a unit of measure used to quantify how much energy a person has throughout a given day. Each activity requires a given number of spoons, which will only be replaced as the person "recharges" through rest. A person who runs out of spoons has no choice but to rest until their spoons are replenished.
For example, in Debian, I could spend: What is one person capable of doing? Have reasonable expectations, on others: Have reasonable expectations, on yourself: Debian is a shared responsibility When spoons are limited, what takes more energy tends not to get done As the project grows, project-wide tasks become harder Are they still humanly achievable? I don't want Debian to have positions that require hero-types to fill them Dictatorship of who has more spoons: Perfectionism You are in a relationship that is just perfect. All your friends look up to you. You give people relationship advice. You are safe in knowing that You Are Doing It Right. Then one day you have an argument in public. You don't just have to deal with the argument, but also with your reputation and self-perception shattering. One thing I hate about Debian: consistent technical excellence. I don't want to be required to always be right. One of my favourite moments in the history of Debian is the openssl bug. Debian doesn't need to be technically perfect, it needs to be socially healthy, technical problems can be fixed. I want to remove perfectionism from Debian: if we discover we've been wrong all the time in something important, it's not the end of Debian, it's the beginning of an improved Debian. Too good to be true There comes a point in most people's dating experience where one learns that when some things feel too good to be true, they might indeed be. There are people who cannot say no: There are people who cannot take a no: Note the diversity statement: it's not a problem to have one of those (and many other) tendencies, as long as one manages to keep interacting constructively with the rest of the community Also, it is important to be aware of these patterns, to be able to compensate for one's own tendencies. What happens when an avoidant person meets a narcissistic person, and they are both unaware of the risks? Resources: Note: there are problems with the way these resources are framed: Red flag / green flag http://pervocracy.blogspot.ca/2012/07/green-flags.html Ask for examples of red/green flags in Debian. Green flags: Red flags: Apologies / Dealing with issues I don't see the usefulness of apologies that are about accepting blame, or making a person stop complaining. I see apologies as opportunities to understand the problem I caused, help fix it, and possibly find ways of avoiding causing that problem again in the future. A Better Way to Say Sorry lists a 4 step process, which is basically what we do in bug reports already: 1. Try to understand and reproduce the exact problem the person had. 2. Try to find the cause of the issue. 3. Try to find a solution for the issue. 4. Verify with the reporter that the solution does indeed fix the issue. This is just to say
My software ate
the files
that were in
your home directory and which
you were probably
needing
for work

Forgive me
it was so quick to write
without tests
and it worked so well for me
(inspired by a 1934 poem by William Carlos Williams) Don't be afraid to fail Don't be afraid to fail or drop the ball. I think that anything that has a label attached of "if you don't do it, nobody will", shouldn't fall on anybody's shoulders and should be shared no matter what. Shared or dropped. Share the responsibility for a healthy relationship Don't expect that the more experienced mates will take care of everything. In a project with active people counted by the thousand, it's unlikely that harassment isn't happening. Is anyone writing anti-harassment? Do we have stats? Is having an email address and a CoC giving us a false sense of security?
When you get involved in a new community, such as Debian, find out early where, if that happens, you can find support, understanding, and help to make it stop. If you cannot find any, or if the only thing you can find is people who say "it never happens here", consider whether you really want to be in that community.
(from http://www.enricozini.org/blog/2016/debian/you-ll-thank-me-later/)
There are some nice people in the world. I mean nice people, the sort I couldn't describe myself as. People who are friends with everyone, who are somehow never involved in any argument, who seem content to spend their time drawing pictures of bumblebees on flowers that make everyone happy. Those people are great to have around. You want to hold onto them as much as you can. But people only have so much tolerance for jerkiness, and really nice people often have less tolerance than the rest of us. The trouble with not ejecting a jerk, whether their shenanigans are deliberate or incidental, is that you allow the average jerkiness of the community to rise slightly. The higher it goes, the more likely it is that those really nice people will come around less often, or stop coming around at all. That, in turn, makes the average jerkiness rise even more, which teaches the original jerk that their behavior is acceptable and makes your community more appealing to other jerks. Meanwhile, more people at the nice end of the scale are drifting away.
(from https://eev.ee/blog/2016/07/22/on-a-technicality/) Give people freedom If someone tries something in Debian, try to acknowledge and accept their work. You can give feedback on what they are doing, and try not to stand in their way, unless what they are doing is actually hurting you. In that case, try to collaborate, so that you all can get what you need. It's ok if you don't like everything that they are doing. I personally don't care if people tell me I'm good when I do something, I perceive it a bit like "good boy" or "good dog". I rather prefer if people show an interest, say "that looks useful" or "how does it work?" or "what do you need to deploy this?" Acknowledge that I've done something. I don't care if it's especially liked, give me the freedom to keep doing it. Don't give me rewards, give me space and dignity. Rather than feeding my ego, feed my freedom, and feed my possibility to create.

Mike Gabriel: @DebConf17: Work for Debian and FLOSS I got done during DebCamp and DebConf... and Beyond...

People I Met and will Remember Topics I have worked on Talks and BoFs Packages Uploaded to Debian unstable Packages Uploaded to Debian NEW I also looked into lightdm-webkit2-greeter, but upstream is in the middle of a transition from Gtk3 to Qt5, so this has been suspended for now. Packages Uploaded to oldstable-/stable-proposed-updates or -security Other Package related Stuff Thanks to Everyone Making This Event Possible A big thanks to everyone who made it possible for me to attend this event!!!
